Abstract. We present a photometric method for identifying
stars, galaxies and quasars in multi-color surveys, which uses
a library of $\gtrsim 65000$ color templates for comparison with
observed objects. The method aims at extracting the information
content of object colors in a statistically correct way, and per- 
forms a classification as well as a redshift estimation for galax- 
ies and quasars in a unified approach based on the same proba- 
bility density functions. For the redshift estimation, we employ 
an advanced version of the Minimum Error Variance estimator 
which determines the redshift error from the redshift dependent 
probability density function itself. 

The method was originally developed for the Calar Alto 
Deep Imaging Survey (CADIS), but is now used in a wide va- 
riety of survey projects. We checked its performance by spec- 
troscopy of CADIS objects, where the method provides high 
reliability (6 errors among 151 objects with $R < 24$), especially
for the quasar selection, and redshifts accurate within
$\sigma_z \approx 0.03$ for galaxies and $\sigma_z \approx 0.1$ for quasars.

For an optimization of future survey efforts, a few model 
surveys are compared, which are designed to use the same to- 
tal amount of telescope time but different sets of broad-band 
and medium-band filters. Their performance is investigated by 
Monte-Carlo simulations as well as by analytic evaluation in 
terms of classification and redshift estimation. If photon noise 
were the only error source, broad-band surveys and medium-band
surveys should perform equally well, as long as they provide
the same spectral coverage. In practice, medium-band surveys 
show superior performance due to their higher tolerance for 
calibration errors and cosmic variance. 

Finally, we discuss the relevance of color calibration and 
derive important conclusions for the issues of library design 
and choice of filters. The calibration accuracy poses strong con- 
straints on an accurate classification, which are most critical for 
surveys with few, broad and deeply exposed filters, but less se- 
vere for surveys with many, narrow and less deep filters. 
1. Introduction

Sky surveys are designed to provide statistical samples of as- 
tronomical objects, aiming for spatial overview, completeness 
and homogeneous datasets. Mostly they serve as a database for 
rather general conclusions about abundant objects, but another 
attractive role is to enable searches for rare and unusual objects.
For both purposes, it is very useful to predict rather precisely 
the appearance of the different known types of objects. The object
types can then be discriminated successfully, which allows the
information content of the survey to be extracted. Also, unusual
objects can be found as inconsistent with all known sorts of 
objects, but they might as well hide among the bulk of normal 
objects mimicking their appearance. 

In this picture, we of course want a survey to perform as
reliably and as accurately as possible in measuring object characteristics
like class, redshift or physical parameters. Since surveys
typically aim for large samples upon which future detailed
work is based, their results are often not extremely reliable and 
accurate for a given single object. But for a statistical analysis 
of large samples, we can usually do without perfect accuracy in 
the measurement of features and we can also accept occasional 
misclassifications. 

In astronomical surveys pointing off the galactic plane, ob- 
vious classes to start out with could basically be stars, galaxies, 
quasars and strange objects. These can be further differentiated 
into subclasses, based on physical characteristics derived from 
their morphology or spectral energy distribution (SED). There- 
fore, morphology and color or prominent spectral features are 
the typical observational criteria applied to survey data for clas- 
sifying the objects contained. 

Presently, surveys concentrate mostly on either imaging or 
spectroscopy. While spectroscopic surveys deliver a potentially 
high spectral resolution, they have expensive requirements for 
telescope time. Imaging multi-color surveys can expose a num- 
ber of filters consecutively, and deliver morphological informa- 
tion and crude spectral information for all objects contained in 
the field of view. 

Since the subject of this paper is the spectral information in
multi-color surveys, we mention morphological information
only briefly. Morphology is only of limited use for
classifying objects into stars, galaxies and quasars: Objects observed
as clearly extended are certainly not single stars, but the
smaller ones could either be galaxies, low-luminosity quasars, 
or chance projections of more than one object. Objects con- 
sistent with point-sources can be stars, compact galaxies or 
quasars. Also, the morphological differentiation depends on the
seeing conditions and typically does not reach the survey limits
set by the photometry.

The power of spectral classification in a multi-color survey 
depends both on the filter set used and the depth of the imaging, 
where the optimum choices are determined by the goal of the 
survey. If a survey aims at identifying only one type of object 
with characteristic colors, a tailored filter set can be designed. 
E.g., when looking exclusively for U-band dropouts (Steidel et
al. 1995), the UGR filter set is certainly a very good choice. The 
performance of such a dropout survey depends mostly on the 
depth reached in the U-band, so the photon flux detection limit 
in U is the key figure. Also, number count studies are limited by 
the completeness limit in the filter of concern. Quasar search is
very often done with color excess rules (Hazard 1990), where
the limit is given by the flux errors combined from two or three
filters. E.g., the evolution of quasars between redshift 0 and 2.2
was established using the UV excess method (Schmidt & Green
1983; Boyle et al. 1988). At higher redshift quasars display
rather star-like broad-band colors, motivating more advanced
approaches like the selection of outliers in an n-dimensional
color space (Warren et al. 1991).

If we now intend to focus different survey programs on a 
common patch of sky to maximise synergy effects from the 
various efforts, then we might as well combine the individual 
surveys into one that identifies every object, and avoid double 
work. Then we have to ask for a filter set which enables identifying
virtually every object above some magnitude limit unambiguously.
In this case, the key number for the performance
is the magnitude limit for a successful classification as needed 
for various science applications. If the classification takes all 
available color data into account, like template fitting proce- 
dures do, then the flux limit of a single filter is not the only 
relevant number, since the performance will depend to a large 
extent on the filter choice. This applies also for the estimation 
of multi-color redshifts, an idea dating back to Baum (1962), 
who used nine-band photoelectric data to estimate the redshifts 
of galaxy clusters. 

Most multi-color surveys conducted to date obtained spec- 
tral information via broad-band photometry. They have been 
used e.g. to search for quasars or for high-redshift galaxies. 
However, they always needed follow-up spectroscopy to clarify 
the true nature of the candidates and to measure their redshift.
The Sloan Digital Sky Survey (York et al. 2000) is now the
most ambitious project to provide a broad-band color database, 
on which the astronomical community might perform a large 
number of "virtual surveys". 

So far, only very few survey projects make extensive use
of medium-band and narrow-band photometry, e.g. the Calar
Alto Deep Imaging Survey (Meisenheimer et al. 1998). Surveys
like CADIS with typically 10 to 20 filters sample the visual
spectrum with a resolution comparable to that of low-resolution
imaging spectroscopy. CADIS fostered the development of a
scheme for spectral classification that distinguishes stars,
galaxies, quasars and strange objects. Simultaneously, it
assigns multi-color redshifts to extragalactic objects.

Using 162 spectroscopic identifications, Wolf et al. (2000a,
henceforth paper II) have shown that it is reliable for virtually
all objects above the $10\sigma$ limits of the CADIS survey. Also,
the photometric redshifts are accurate enough ($\sigma_z \approx 0.03$ for
galaxies and $\sigma_z \approx 0.1$ for quasars around the $10\sigma$ limit)
that follow-up spectroscopy is not needed for a number of analyses,
e.g. the derivation of galaxy luminosity functions (Fried
et al. 2000).

Although this algorithm was developed for CADIS, it is now
used for classification in additional projects. It provides multi-color
redshifts in lensing studies of the cluster Abell 1689 (Dye
et al. 2000), aiming at determining the cluster mass after identifying
cluster members and weakly lensed background objects.
It is also employed for an ongoing wide-field survey to
search for high-redshift quasars, to provide multi-color redshifts
for galaxy-galaxy lensing studies, to search for high-redshift
galaxy clusters and to perform a census of $L^*$ galaxies
at $z \approx 1$ (Wolf et al. 2000b).



The purpose of this paper is to present our classification
scheme and to discuss how survey strategies can be optimized
for its use. The statistical algorithm for the scheme is
presented in Sect. 2 and our choice for the template libraries is 
detailed in Sect. 3. In Sect. 4 we report on simulations of a few 
competitive filter sets and their expected classification perfor- 
mance. We include an analytic discussion on the comparison 
of filter sets and conclude that medium-band surveys are alto- 
gether more powerful, even when being limited by available 
telescope time. Sect. 5 outlines a few real datasets using this
classification and draws conclusions about the expected performance.
Paper II presents the real CADIS data on which
we gained experience during the development of the scheme,
and shows that the conclusions from the simulations compare
well with the real dataset.

2. The classification algorithm 

2.1. General remarks on classification 

Generally speaking, classification is a process of pattern recognition
which usually has to deal with noisy data. Mathematically,
a classifier is a function that maps a feature
vector of measured object characteristics onto a discriminant
vector, which contains the object's likelihoods for belonging to
the different available classes. Any classification relies on the
feature space being chosen such that different classes cover dif- 
ferent volumes and overlap as little as possible to avoid ambi- 
guities. 

If a survey is designed without class definitions in mind, 
it will be difficult to choose a set of measurable features for 
a tailored classification. Also, only unsupervised classifiers (= 
working without knowledge input) can be used to work on mea- 
sured object lists. In this case, a classifier can find distinguish- 
able classes, e.g. by cluster analysis. This process leads to a 
definition of new class terms which depends strongly on the
visible features taken into account.

For any classification problem, it is of great advantage, if 
class terms are defined a priori and encyclopedic knowledge 
is available about measurable features and their typical values. 
Then models of the classes representing this knowledge can be 
constructed to serve as an essential input to a supervised clas- 
sifier (= using input knowledge as a guide). When selecting the 
features, two potential problems should be avoided: One is the 
use of well-known but hardly discriminating features, which 
will obviously not improve the classification but just increase 
the effort. The other is using features which are not well-known 
and therefore can easily cause mistakes in the classification. 
Especially at high measurement accuracy, this can lead to
apparent unclassifiability when an object looks different from
what is expected.

Two different types of class models can be distinguished 
depending on the uniqueness of the classification answer: 

1. In one type of model, geometric rules are used to delimit
sectors in feature space covered by the classes in competition.
These models assign just one class to the measurement
uniquely and definitely, which is the one containing
the feature vector within its geometric limits. Effectively,
the discriminant vector does not contain likelihood values
in a statistical sense but instead a single entry '1' for
the class decided on and zeros for the other classes (while
nearest-neighbor classifications can define rather complicated
boundary shapes in feature space, they also belong to
this type).

2. Another type are statistical class models rendered as like- 
lihood functions which are defined across the entire fea- 
ture space range. Only these provide discriminant vectors 
with relative likelihoods of class membership for an object, 
thereby following a "fuzzy logic" approach. 

While classes are discrete entities, a statistical classification 
can also work on continuous parameters. The discriminant vec- 
tor then becomes a likelihood function of the parameter value. 
Based on this distinction, classification problems can be considered
as decision problems for discrete variables and estimation
problems for continuous variables (Melsa & Cohen 1978a).
In either case, a definite statistical classification contains two
consecutive steps: First, the discriminant vector is determined
(see Sect. 2.2) and second, it is mapped either by decision to a
final class or to a parameter estimate (see Sect. 2.3).



2.2. Step 1: Determining discriminant vectors 

We assume an object with $m$ features being measured by any
device, thus displaying the feature vector $q = (q_1, \ldots, q_m)$. We
consider $n$ classes $c_1, \ldots, c_n$ as a possible nominal interpretation
and denote the likelihood of this object to belong to the
class $c_i$ as $p(c_i|q)$. A true member of class $c_i$ has an a-priori
probability of displaying the features $q$ given by $p(q|c_i)$.

Initially, we assume a simple case of uniquely defined class
models, where all members of a single class $c_i$ have the same
intrinsic features $q_{c_i}$, so that any spread in measured $q$ values
arises solely from measurement errors. Assuming a Gaussian
error distribution for every single feature, it follows (Melsa &
Cohen 1978b) that

$$p(q|c_i) = C \exp\left(-\frac{1}{2}\,(q - q_{c_i})\, V^{-1}\, (q - q_{c_i})^T\right) , \tag{1}$$

where $(q - q_{c_i})$ is the measurement error in case the object
does belong to $c_i$ and $(q - q_{c_i})^T$ is its transposed version. Each
feature $q_k$ is measured with its own error variance $\sigma_k^2$; these
variances are the diagonal elements of the variance-covariance matrix $V$.
If all the features are statistically independent, the off-diagonal
elements vanish. The normalisation factor $C$ is

$$C = \frac{1}{\sqrt{(2\pi)^m\, |V|}} . \tag{2}$$



As contained in the discriminant vector, the likelihood for
an object observed with $q$ to belong to class $c_i$ is then

$$p(c_i|q) = p(q|c_i) \Big/ \sum_{j=1}^{n} p(q|c_j) . \tag{3}$$
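As an illustration of Eqs. (1) to (3), the following minimal Python sketch evaluates the Gaussian probability of a measurement for a few point-like classes and normalises the result into a discriminant vector. It assumes statistically independent features (a diagonal matrix $V$); the function names, colors and errors are hypothetical values chosen for illustration and are not taken from the survey.

```python
import math

def gaussian_probability(q, q_class, sigma):
    """p(q|c_i) of Eq. (1), assuming a diagonal variance-covariance matrix."""
    m = len(q)
    # Sum of squared, error-scaled feature differences
    chi2 = sum(((qk - ck) / sk) ** 2 for qk, ck, sk in zip(q, q_class, sigma))
    # |V| is the product of the variances for a diagonal matrix
    det_v = math.prod(sk ** 2 for sk in sigma)
    norm = 1.0 / math.sqrt((2.0 * math.pi) ** m * det_v)  # Eq. (2)
    return norm * math.exp(-0.5 * chi2)

def discriminant_vector(q, class_features, sigma):
    """p(c_i|q) of Eq. (3) for a list of point-like classes."""
    p = [gaussian_probability(q, c, sigma) for c in class_features]
    total = sum(p)
    return [pi / total for pi in p]

# Hypothetical example: two features, three point-like classes
q = [0.50, 1.00]
class_features = [[0.45, 1.05], [1.20, 0.30], [0.00, 0.00]]
sigma = [0.10, 0.10]
likelihoods = discriminant_vector(q, class_features, sigma)
```

Because the same errors enter every class, the factor $C$ cancels in Eq. (3); it is kept here only to mirror the equations.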



However, in realistic cases the classes themselves are extended
in feature space and their volume might have rather
complicated shapes. In the spirit of Parzen's kernel estimator
(Parzen 1963) the extended class $c_i$ can be represented by a
dense cloud of individual uniquely defined (point shape) members
$c_{ij}$. Every member accounts for some a-priori probability
to display $q$, given as $p(q|c_{ij})$, just as if it were a "class" on its
own. The complete class $c_i$ is now rendered as a superposition
of its $N_i$ members and adds up to a total probability of

$$p(q|c_i) = \sum_{j=1}^{N_i} p(q|c_{ij}) . \tag{4}$$



In an estimation problem the probability functions have the
same form, except for changes in the notation: $\theta$ denotes the parameter
to be estimated, and ideally the class model $c_i$ would have a
continuous shape covering the range of expected values. The
discriminant vector would then be a function $p(\theta|q)$. Again,
the class model can be approximated by a discrete set of members
sampling the $\theta$ range of interest at sufficient density.

The astronomical application discussed in this paper poses 
a mixture of decision and estimation problems which can be 
realized simultaneously with a unified approach: The decision 
may choose from the three classes $c_1$ = stars, $c_2$ = galaxies
and $c_3$ = quasars, and an estimation process takes care of the
parameters redshift and different spectral energy distributions
(SED). The internal structure of every class $c_i$ is then spanned
by its individual parameter set $\theta_i = \{\theta_{ij}\}$, either following
a grid design or being unsorted if no parameter structure is 
needed. 

If one chooses to approximate the spatial extension of a 
class by a dense grid sampling discrete parameter values, two 
problems are solved at once: on the one hand, an internal struc- 
ture is present for estimating parameters, and on the other hand, 
the class is well represented for calculating its total probability 
$p(q|c_i)$. Altogether, the probability function with internal parameters
$\theta_{ij}$ being represented by class members is then

$$p(q|\theta_{ij}) = C \exp\left(-\frac{1}{2}\,(q - q_{\theta_{ij}})\, V^{-1}\, (q - q_{\theta_{ij}})^T\right) , \tag{5}$$

with the total probability for class $c_i$ being

$$p(q|c_i) = \sum_{j} p(q|\theta_{ij}) , \tag{6}$$

and the equation for the class likelihood function still being

$$p(c_i|q) = p(q|c_i) \Big/ \sum_{j=1}^{n} p(q|c_j) . \tag{7}$$

Based on these probability functions the classification can 
perform a decision between object classes and estimations of 
redshift and other object parameters at once. Two different 
analyses are integrated into one paradigm and calculated ef- 
ficiently by evaluating the same probability density function. 
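
The superposition of class members can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions: the library below is a hypothetical dictionary of two-color members, the features are taken as independent, and the common normalisation factor is dropped because it cancels in the class likelihood.

```python
import math

def member_probability(q, q_member, sigma):
    """Unnormalised Gaussian probability of q for one class member."""
    chi2 = sum(((a - b) / s) ** 2 for a, b, s in zip(q, q_member, sigma))
    return math.exp(-0.5 * chi2)

def class_likelihoods(q, library, sigma):
    """Sum member probabilities per class, then normalise over classes."""
    totals = {
        name: sum(member_probability(q, member, sigma) for member in members)
        for name, members in library.items()
    }
    norm = sum(totals.values())  # may underflow to 0 for very distant objects
    return {name: total / norm for name, total in totals.items()}

# Hypothetical library: each class is a small cloud of point-like members
library = {
    "stars": [[0.2, 0.1], [0.4, 0.3], [0.6, 0.5]],
    "galaxies": [[0.9, 1.0], [1.1, 1.2]],
    "quasars": [[0.0, 1.5]],
}
q = [1.0, 1.1]
sigma = [0.1, 0.1]
posterior = class_likelihoods(q, library, sigma)
```

The per-member probabilities computed on the way are the same quantities needed for parameter estimation, which is why both analyses share one evaluation.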



2.3. Step 2: Decision and estimation 

Decision rules are functions mapping a discriminant vector
$p(c_i|q)$ to a decision value $d$. The value $d_i$ denotes a decision
in favor of class $c_i$, i.e. the object displaying features $q$ is then
assumed to belong to this class. The simplest decision rule
is the maximum likelihood (ML) scheme, which decides for the
one class with the highest likelihood $p$. In the case of two classes
this means

$$\begin{array}{ll} \text{if } p(c_1|q) > p(c_2|q), & \text{then } d_1 \\ \text{if } p(c_1|q) < p(c_2|q), & \text{then } d_2 . \end{array} \tag{8}$$

A more compact notation for the same rule is

$$p(c_1|q) \; \underset{d_2}{\overset{d_1}{\gtrless}} \; p(c_2|q) . \tag{9}$$



Depending on the purpose of the classification, tailored improvements
can be made to this rule. The probability of error
(PoE) method, e.g., attempts to minimize the rate of misclassifications
by including the a-priori probability for observing a
member of a given class. Following Bayes' theorem these "priors",
denoted $P(c_1)$ and $P(c_2)$, are just the relative abundances
of the classes in the whole sample. The PoE decision rule is then

$$p(c_1|q)\, P(c_1) \; \underset{d_2}{\overset{d_1}{\gtrless}} \; p(c_2|q)\, P(c_2) , \tag{10}$$



which causes somewhat ambiguous objects to be preferentially
classified as belonging to the more common class. Rare
objects are then less likely to be found at all, but the overall performance
of the classifier improves. A general approach uses
any type of priors for trimming the classification towards specific
goals, so every decision rule compares the likelihood ratio
$\Lambda$ with a threshold $T$ and follows the form (with $T = 1$ for the
ML decision)

$$\Lambda(q) = \frac{p(c_1|q)}{p(c_2|q)} \; \underset{d_2}{\overset{d_1}{\gtrless}} \; T . \tag{11}$$
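
The threshold form of the two-class decision can be illustrated with a short Python sketch. The helper names and the numbers are hypothetical; the PoE threshold $T = P(c_2)/P(c_1)$ follows from dividing both sides of the PoE rule by $p(c_2|q)\,P(c_1)$. Ties are assigned to class 2 here, which is an arbitrary choice of this sketch.

```python
def decide(p1, p2, threshold=1.0):
    """Return 1 if the likelihood ratio p1/p2 exceeds the threshold, else 2.

    The comparison is written as p1 > threshold * p2 to avoid dividing by zero.
    """
    return 1 if p1 > threshold * p2 else 2

def poe_threshold(prior1, prior2):
    """Threshold that turns the generic rule into the PoE rule."""
    return prior2 / prior1

# Hypothetical object that slightly prefers the rare class 1
p1, p2 = 0.6, 0.4
ml_decision = decide(p1, p2)                             # T = 1 (ML)
poe_decision = decide(p1, p2, poe_threshold(0.1, 0.9))   # T = 9 (PoE)
```

With these numbers the ML rule picks class 1, while the PoE rule assigns the ambiguous object to the more abundant class 2.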



Estimation rules are functions mapping a discriminant vector
$p(\theta|q)$ to an estimated value $\hat{\theta}$. The simplest estimation
rule is again the maximum likelihood (ML) rule, which chooses
the one parameter value with the highest likelihood $p$, i.e., the
ML estimator is given by

$$p(\theta_{ML}|q) \geq p(\theta|q) \quad \forall\, \theta . \tag{12}$$



The Bayesian approach can also be applied to continuous
variables, where one special case is of particular interest: if
the error distribution of the feature measurement is Gaussian,
and if the goal is to minimize the variance of the true estimation
error, then the optimum estimation rule can be derived analytically
(Melsa & Cohen 1978b). This minimum error variance
(MEV) estimator is given by

$$\theta_{MEV} = \frac{\int \theta\, p(\theta|q)\, P(\theta)\, d\theta}{\int p(\theta|q)\, P(\theta)\, d\theta} , \tag{13}$$

and it is equivalent to interpreting the discriminant vector
as a statistical ensemble and determining the mean of the distribution.
It is also dubbed mean square estimator or conditional
mean estimator. Note that, if $p(\theta|q)$ is symmetric in $\theta$ and unimodal,
the MEV estimator is identical to the ML estimator.
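
The difference between the two estimators can be seen in a small Python sketch. It assumes the likelihood function is sampled on a uniform parameter grid, so that the integrals reduce to plain sums; the grid and the likelihood values are hypothetical.

```python
def ml_estimate(theta, p):
    """Grid value with the highest likelihood (ML rule)."""
    best = max(range(len(p)), key=p.__getitem__)
    return theta[best]

def mev_estimate(theta, p, prior=None):
    """Likelihood-weighted mean of the grid values (MEV rule)."""
    if prior is None:
        prior = [1.0] * len(theta)  # flat prior
    weights = [pi * pr for pi, pr in zip(p, prior)]
    return sum(t * w for t, w in zip(theta, weights)) / sum(weights)

theta = [0.0, 0.1, 0.2, 0.3, 0.4]

# Symmetric, unimodal likelihood: both estimators agree
p_symmetric = [0.05, 0.20, 0.50, 0.20, 0.05]
# Skewed likelihood: the mean is pulled away from the peak
p_skewed = [0.10, 0.60, 0.20, 0.10, 0.00]

ml_sym, mev_sym = ml_estimate(theta, p_symmetric), mev_estimate(theta, p_symmetric)
ml_skew, mev_skew = ml_estimate(theta, p_skewed), mev_estimate(theta, p_skewed)
```

For the skewed case the ML rule returns the peak at 0.1, whereas the MEV rule returns the weighted mean of about 0.13.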

2.4. Application to astronomical multi-color surveys 

Deep extragalactic surveys usually contain mostly galaxies,
fewer stars and a tiny fraction of quasars, with relative numbers
on the order of 100:10:1. A survey at galactic latitudes of
$|b| \gtrsim 50^\circ$ with a limiting magnitude of $R = 23$ and an area of
1 square degree, e.g., should contain roughly 30000 galaxies (Metcalfe et
al. 1995), some 3000 to 6000 stars (Bahcall & Soneira 1981;
Phleps et al. 2000), and about 400 quasars including Seyfert-1
galaxies (Hartwick & Schade 1990). Any classification would
ideally be capable of distinguishing all three classes of objects.
Only in surveys that do not care about the rare quasars
could their class be dropped, so that the classification would need
to separate only stars from galaxies.

In addition to the class itself, plenty of physical parameters 
could potentially be recovered from an object's photometric 
spectrum. Most importantly, we would like to determine red- 
shift estimates for galaxies and quasars. In addition, the spec- 
tral energy distribution of galaxies contains information about 
their star formation rate and the age of their stellar populations. 
A photometric spectrum of sufficiently high spectral resolution
can even allow the intensity of emission lines to be estimated. Finally,
the spectra of stars mostly reveal their effective temperature,
but also their metallicity and their surface gravity.

The literature provides abundant knowledge of spectral 
properties for all three object classes. Synthetic photometry 
can use published spectra together with efficiency curves from
the survey filter set in order to obtain predicted colors of ob- 
jects. Sometimes, model assumptions are needed to fill in data 
gaps present in the literature, which could either be gaps on 
the spectral wavelength axis or gaps on physical parameter 
ranges, e.g. star-formation rate. Eventually, systematic multi- 
color class models can be calculated from published libraries 
covering various physical parameters. These can serve for later 
comparison with observed data. Therefore, we decided to build 
a statistical classification based on published spectral libraries 
and a limited number of model assumptions (see Sect. 3). 

In a multi-color survey the dominant information gathered 
are the object fluxes in the different filters. We decided to use 
the color indices as an input to the classification rather than
the fluxes themselves, which eliminates one dimension from
the problem by removing the need for any flux normalisation,
which otherwise remains an additional fit parameter in template fitting
procedures. It will be shown in Sect. 2.5 that the color-based
approach is equivalent to the flux-based one under certain con- 
straints. 

Morphological information is typically also available to 
some extent and can be included in the classification based on 
the assumption that only galaxies are capable of showing spa- 
tial extent. But this should be done carefully, since luminous 
host galaxies can render quasars as extended. Also, if the im- 
age quality varies across the observed field, the morphological 
analysis is of limited use for not clearly extended sources. 

We define the color $q_{g-h}$ as a magnitude difference between
the flux measurements in two filters, $F_g$ and $F_h$:

$$q_{g-h} = m_g - m_h = -2.5 \log \frac{F_g}{F_h} . \tag{14}$$



Obviously, the color system depends on the filter set chosen
and also on the flux normalisation used. As long as the
flux errors are relatively small, the linear approximation of the
logarithm can be used to express magnitude errors as
$\sigma_{m_i} \approx \sigma_{F_i}/F_i$, so that the error of the color is

$$\sigma_{q_{g-h}} = \sqrt{(\sigma_{F_g}/F_g)^2 + (\sigma_{F_h}/F_h)^2} . \tag{15}$$
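
As a worked illustration, the color index and its approximate error can be computed from two fluxes as in the following Python sketch. The fluxes and errors are hypothetical, and the error follows the approximation used in the text; an exact first-order propagation would carry an additional factor of $2.5/\ln 10 \approx 1.086$.

```python
import math

def color(flux_g, flux_h):
    """Color index q_{g-h} as a magnitude difference of two fluxes."""
    return -2.5 * math.log10(flux_g / flux_h)

def color_error(flux_g, sigma_g, flux_h, sigma_h):
    """Approximate color error from the relative flux errors."""
    return math.hypot(sigma_g / flux_g, sigma_h / flux_h)

# Hypothetical 20-sigma detections in both filters (arbitrary flux units)
q_gh = color(100.0, 200.0)
sigma_q_gh = color_error(100.0, 5.0, 200.0, 10.0)
```

Here the object is a factor of two fainter in filter g, giving a color of about 0.75 mag with an error of about 0.07 mag.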



Since the likelihoods determined for the classification depend
sensitively on the colors $q$ and their errors $\sigma_q$, both values
must be carefully calibrated. If any color offset is present
between measurement and model, the classification will go
wrong systematically. If errors are underestimated, the likelihood
function could focus on a wrong interpretation, rather
than including the full range of likely ones. Overestimated errors
will obviously diffuse the likelihoods and give away focus
which is originally present in the data. The approximation of
errors as presented will only work well with flux detections of
at least $5\sigma$ to $10\sigma$, but at lower levels the classification is likely
to fail anyway, so we ignore this concern.

Given $q$ and $\sigma_q$, a measured object is represented by a Gaussian
error distribution rather than a single color vector. If colors
are measured very accurately and the object is rendered as
a narrow distribution, it could possibly fall between two grid
steps of a discrete class model and "get lost" for the classification.
In this case low likelihoods would be derived despite the
spatial proximity of object and model in terms of metric distance.
The likelihood function would appear not much different
from that of a truly strange object residing off the class in
an otherwise empty region of color space. In technical terms,
the classification would violate the sampling theorem (Jähne
1991), and the probability functions would not be invertible
any more.

For discrete class models the sampling theorem requires 
that every measurement falling inside the volume of a model 
should "see" at least two model members inside of its Gaussian 
core. Due to practical limitations of computing time and stor- 
age space, it does not make sense to develop discrete models 
with virtually infinite density accounting for arbitrarily sharp 
measurements. Also, for measurements with low photon noise 
the dominant source of error will be the limited accuracy of the 
color calibration. 

The solution to the problem is then to design the discrete 
model with the achievable measurement accuracy in mind, and 
to smooth the discrete model into a continuous entity by con- 
volving its grid with a continuous function that is wide enough 
to prevent residual low-density holes between the grid points. 
A sensible smoothing width would just fulfill the sampling the- 
orem, i.e. the smoothing function should roughly stretch over a 
couple of discrete points. As a result, even an extremely sharp 
measurement will be covered by the model and classified cor- 
rectly. 

Higher resolution would only increase the computational 
efforts while lower resolution would ignore information which 
is present in the data and therefore potentially worsen the clas- 
sification. From a different point of view, one could leave the 
discrete model unchanged and claim the data to have larger 
effective errors by including the calibration errors thereby lim- 
iting the width of the Gaussian data representation to a lower 
threshold, which will always ensure the sampling theorem on 
the discrete grid anyway. 

Both approaches are mathematically identical, if one 
chooses to represent the calibration errors as well as the 
smoothing function by a Gaussian. Due to the symmetry of the 
Gaussian function, convolving the discrete grid or convolving 
the error distribution of the data yields the same result. The 
choice of the Gaussian is computationally very efficient, because
the convolution of the Gaussian measurement with the
Gaussian calibration error results in another Gaussian of enlarged
width. As mentioned in Sect. 5.1 and discussed in paper
II, a survey in the visual bands can be calibrated with a relative
accuracy on the order of 3% between the different filters.
Therefore, we decide to apply a Gaussian with a width of 0.03 mag
as a smoothing function.

In summary, we apply the formalism presented in Sect. 2.2
in the following way: the errors $\sigma_{q_i}$ of the colors $q_i$ are convolved
with the smoothing 0.03 mag Gaussian and as a result the
effective errors are

$$\sigma_{q_i,\mathrm{eff}} = \sqrt{\sigma_{q_i}^2 + (0.03)^2} . \tag{16}$$
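
The effect of the smoothing on the errors is easy to illustrate: convolving two Gaussians adds their widths in quadrature. The Python sketch below assumes a calibration width of 0.03 mag as stated in the text; the constant and function names are hypothetical.

```python
import math

CALIBRATION_SIGMA = 0.03  # smoothing width in magnitudes, as quoted in the text

def effective_error(sigma_q, sigma_cal=CALIBRATION_SIGMA):
    """Quadrature sum of the photometric color error and the smoothing width."""
    return math.hypot(sigma_q, sigma_cal)

# Hypothetical color errors from very bright to faint objects
errors = [0.0, 0.01, 0.04, 0.20]
effective = [effective_error(s) for s in errors]
```

For bright objects the effective error saturates at 0.03 mag, which guarantees that even an arbitrarily sharp measurement still overlaps neighboring grid points of the library, while for faint objects the photon noise dominates and the smoothing has little effect.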
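For simplicity, we assume the individual colors to be uncorrelated,
which is actually not true for filters sharing spectral
regions in their transmission. The variance-covariance matrix
then becomes diagonal,

$$V = \mathrm{diag}\left(\sigma_{q_1}^2, \ldots, \sigma_{q_m}^2\right) , \tag{17}$$

and the probability function turns into

$$p(q|c_{ij}) = C \exp\left(-\frac{1}{2} \sum_{k=1}^{m} \frac{(q_k - q_{c_{ij},k})^2}{\sigma_{q_k}^2}\right) , \tag{18}$$

where the $\sigma_{q_k}$ are the effective color errors.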



Based on the three object classes discussed, the likelihood
function is

$$p(c_i|q) = \frac{p(q|c_i)}{p(q|c_{\mathrm{stars}}) + p(q|c_{\mathrm{galaxies}}) + p(q|c_{\mathrm{quasars}})} . \tag{19}$$



Considering three classes implies that extremely faint ob- 
jects with large errors get average probabilities of 33% as- 
signed for all classes. In general applications, we use a decision 
rule for an object seen as q, which requires that one class is at 
least three times more probable than the other two classes put 
together, i.e.: 

If there is one class with $p(c_i|q) > 0.75$, then we assign
this class to the object, but if all classes have likelihoods
below 0.75, we call it unclassifiable.
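
This rule is straightforward to express in code. The following Python sketch assumes the class likelihoods are available as a dictionary that sums to one; the function name and example values are hypothetical.

```python
def classify(likelihoods, threshold=0.75):
    """Assign the class whose likelihood exceeds the threshold, if any."""
    best = max(likelihoods, key=likelihoods.get)
    if likelihoods[best] > threshold:
        return best
    return "unclassifiable"

# Hypothetical bright object with a clear preference
bright = classify({"stars": 0.80, "galaxies": 0.15, "quasars": 0.05})
# Hypothetical faint object with large errors: all classes near 1/3
faint = classify({"stars": 0.34, "galaxies": 0.33, "quasars": 0.33})
```

A likelihood above 0.75 leaves at most 0.25 for the other two classes together, which is the factor of three mentioned in the text.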

For the detection of unusual objects, we look at the color
distance of an object to the nearest member of any class model
to derive a statistical consistency with the class. The value of
this consistency depends on the different color variances and
can be calculated from $\chi^2$-statistics. Lacking an analytic expression,
we use $\chi^2$-tables (Abramowitz & Stegun 1972) to
evaluate the statistical consistency between class and object.
In practice, the resulting $\chi^2$-values need to be normalised to
a plausible scale, since the raw values obtained are enlarged
artificially due to the discrete sampling of the library and cosmic
variance. We use the following operative criterion for the
selection of unusual objects:

If an object is inconsistent at least at a confidence level
of 99.73% (i.e. $3\sigma$ in case of a normal distribution) with
all members of all classes, then we call it strange.

Strange objects can formally be classifiable, if the likeli- 
hoods still prefer a certain class membership. They have either 
intrinsically different spectra without counterparts in the class 
models, or they are reduction artifacts, e.g. when neighboring 
objects affect their color determination, and this is not taken 
into proper account for the error calculation. 

Apart from the rather trivial ML estimator, we use the MEV
estimator to obtain redshifts and SED parameters of galaxies
and quasars. Their class models are designed as regular grids
(see Sect. 3) with members $c_{ij}$ residing at redshift $z_{ij}$. The
MEV estimator for the redshift is then

$$\langle z \rangle_{MEV} = \frac{\sum_j z_{ij}\, p(q|c_{ij})}{\sum_j p(q|c_{ij})} . \tag{20}$$



It is applied to the class models for galaxies and quasars
independently and for each class interpretation an independent
redshift estimate is obtained. There is also an assessment for
the likely error of the $z$ estimate given by the variance of the
distribution $p(q|z)$:

$$\sigma_z^2 = \frac{\sum_j \left(z_{ij} - \langle z \rangle_{MEV}\right)^2 p(q|c_{ij})}{\sum_j p(q|c_{ij})} . \tag{21}$$



This estimation scheme would be sufficient, if models had 
a rather simple shape in color space, i.e. if color space and 
model parameter space could easily be mapped onto each other. 
In fact, the class model for galaxies and particularly the one 
for quasars can have very complicated folded shapes in color 
space, so that the distribution $p(q|z)$ can have a correspondingly
complicated structure that is not at all well described by
mean and variance. 

Therefore, we distinguish three cases: unimodal (single
peaked), bimodal (double peaked) and broad distributions. In
unimodal cases $\langle z \rangle_{\rm MEV}$ and $\sigma_z$ are appropriate reductions of
$p(q|z)$. In bimodal cases we split the redshift axis in two intervals
delimited at $\langle z \rangle_{\rm MEV}$ and obtain two alternative unimodal
solutions with relative probabilities given by the p sums in the
two intervals. If the distribution is so broad that it starts to resemble
a uniform distribution, $\langle z \rangle_{\rm MEV}$ approaches the mean z
value of the model and $\sigma_z$ approaches $\sqrt{1/12}\,(z_{\rm max} - z_{\rm min})$.
In order to keep our statistics clean from such mean redshift
contaminants, we cut off the estimator at some uncertainty:

If an object has $\sigma_z > 1/8\,(z_{\rm max} - z_{\rm min})$, then we ignore
the MEV estimate and call its redshift uncertain.

In particular, it is possible, that an object has a bimodal 
distribution with one peak (result) and one broad (uncertain) 
component. In the following, we denote this extended scheme 
of MEV estimates accounting for possible bimodalities as our 
MEV+ estimate. In Sect. 4.4 we will compare the performance 
of all three estimators, ML versus MEV and MEV+. 
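
The plain MEV estimate, its error and the uncertainty cut can be sketched as follows. This is an illustration under simplifying assumptions: the member probabilities of one class are given as a flat list, the redshift grid and the two test distributions are made up, and the splitting of bimodal distributions that distinguishes MEV+ is not included. The function name `mev_redshift` is hypothetical.

```python
import math

def mev_redshift(z_grid, p_grid, z_min, z_max):
    """Return (z_mev, sigma_z, uncertain) from member redshifts z_ij and
    member probabilities p(q|c_ij) of one class."""
    norm = sum(p_grid)
    if norm <= 0.0:
        raise ValueError("all member probabilities are zero")
    z_mev = sum(z * p for z, p in zip(z_grid, p_grid)) / norm
    var = sum((z - z_mev) ** 2 * p for z, p in zip(z_grid, p_grid)) / norm
    sigma_z = math.sqrt(var)
    # Cut-off: estimates broader than 1/8 of the redshift range are ignored.
    uncertain = sigma_z > (z_max - z_min) / 8.0
    return z_mev, sigma_z, uncertain

# Hypothetical grid: redshifts 0.00 ... 2.00 in steps of 0.01.
z_grid = [i * 0.01 for i in range(201)]
# A sharply peaked distribution around z = 0.5 ...
peaked = [math.exp(-0.5 * ((z - 0.5) / 0.03) ** 2) for z in z_grid]
# ... and a flat one, which carries no redshift information.
flat = [1.0] * len(z_grid)

z_peak, s_peak, u_peak = mev_redshift(z_grid, peaked, 0.0, 2.0)
z_flat, s_flat, u_flat = mev_redshift(z_grid, flat, 0.0, 2.0)
```

For the flat distribution the estimate falls onto the mean redshift of the grid and the error comes close to the value expected for a uniform distribution, so the cut flags it as uncertain.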

An effort was made to implement a classification code opti- 
mized for short computing time. The use of precalculated class 
models eliminates any synthetic photometry from a typical fit- 
ting procedure. Furthermore, the use of colors instead of fluxes 
eliminates the need for finding a flux normalisation. In terms of
CPU time, the classification of one object contains mainly the
calculation of the probability $p(q|c_{ij})$ for every class member,
which involves first adding up all $\sigma$-scaled squared color differences
and second evaluating an exponential function of the
resulting sum, which is already a measure of strangeness. Summing
up the $p(q|c_{ij})$ to obtain class likelihoods and deriving
mean and variance of the internal class parameters should take
less time than calculating the probability density function, if
more than ten color axes are taken into account. With class
models containing about 50000 members and 13 colors, the full
classification of one object takes about 0.3 sec when running on
a 200 MHz UltraSPARC CPU inside a SUN workstation. Since
different survey applications might require different sample se- 
lection schemes, we decided to calculate and store discriminant 
vectors for all objects and select subcatalogs for further analy- 
sis later. 
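
The two computing steps named above can be sketched as follows. The sketch assumes independent Gaussian color errors, equal weights for all library members and no class priors; the toy library and the function names are made up for illustration.

```python
import math

def member_probability(q, sigma, c):
    """Gaussian p(q|c) up to a constant factor, from sigma-scaled squared
    color differences between object colors q and member colors c."""
    chi2 = sum(((qk - ck) / sk) ** 2 for qk, sk, ck in zip(q, sigma, c))
    return math.exp(-0.5 * chi2), chi2

def class_likelihoods(q, sigma, library):
    """Sum member probabilities per class and normalise over all classes."""
    sums = {name: sum(member_probability(q, sigma, c)[0] for c in members)
            for name, members in library.items()}
    total = sum(sums.values())
    return {name: s / total for name, s in sums.items()}

# Toy library with two colors per member (values are made up).
library = {
    "star":   [(0.2, 0.1), (0.4, 0.3)],
    "galaxy": [(0.9, 0.8), (1.1, 1.0)],
}
probs = class_likelihoods(q=(0.38, 0.31), sigma=(0.05, 0.05), library=library)
```

The chi-square sum is returned alongside the probability because it can be reused directly for a consistency check of the object against the member.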

2.5. Equivalence of flux-based and color-based classification 

We now show, that the color-based classification yields the 
same best fit as a flux based template fitting algorithm. Lanzetta 
et al. (1996), e.g., calculate a likelihood function depending on 
redshift z, a spectral energy distribution and a flux normalisa- 
tion parameter A, following the form: 



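$$L(z, {\rm SED}, A) = \prod_{k=1}^{n} \exp\left\{ -\frac{1}{2} \left[ \frac{F_{k,\rm obs} - A\, F_{k,\rm model}}{\sigma_{F_k}} \right]^2 \right\} . \qquad (22)$$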



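Basically, the likelihood determination relies on the
squared photometric distance $d^2$ between observation and
model, resulting from the flux differences $\Delta F_k$ in each filter:

$$d^2 = \sum_k \chi_k^2 \quad {\rm with} \quad \chi_k = \frac{F_{k,\rm obs} - F_{k,\rm model}}{\sigma_{F_k}} = \frac{\Delta F_k}{\sigma_{F_k}} . \qquad (23)$$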



In the color based approach there are $n - 1$ color indices
contributing distance components and we assume the single 
constraint, that there is one particular base filter approximately 
free of flux errors, e.g. a deeply exposed broad-band filter. The 
color indices are made by comparing any filter to this base fil- 
ter ensuring optimum errors for the colors. In this scheme, any 
errors in the relative calibration are absorbed into the color in- 
dices. Therefore, it is very important, that the base filter is not 
wrongly calibrated with respect to the other wavebands, since 
the error would spread into the entire vector of color indices. 

We then look only at a range of good fits, and do not mind
rather crude $\chi$-approximations for relatively bad fits which are
anyway ruled out as solutions. Also, we consider only measurements
with $\sigma_{F_k}/F_k < 0.2$, which allows the assumption
of Gaussian color errors and a linear approximation of the logarithm.
The distance components are (up to an overall sign, which
drops out of $\chi_k^2$):

$$\chi_k = \frac{(m_k - m_{\rm base})_{\rm obs} - (m_k - m_{\rm base})_{\rm model}}{\sigma_{m_k - m_{\rm base}}} \approx 2.5\, \frac{\log(F_k/F_{\rm base})_{\rm obs} - \log(F_k/F_{\rm base})_{\rm model}}{\sqrt{(\sigma_{F_k}/F_k)^2 + (\sigma_{F_{\rm base}}/F_{\rm base})^2}} . \qquad (24)$$

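Using the terms $\Delta F_k$ and $\sigma_{F_{\rm base}}/F_{\rm base} \approx 0$, we obtain

$$\chi_k \approx 2.5\, \frac{F_k}{\sigma_{F_k}} \left[ \log\left(1 + \frac{\Delta F_k}{F_{k,\rm model}}\right) - \log\left(1 + \frac{\Delta F_{\rm base}}{F_{\rm base,model}}\right) \right] . \qquad (25)$$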



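Expanding the logarithm for $\Delta F_k/F_k \ll 1$, we get

$$\chi_k \approx \frac{\Delta F_k}{\sigma_{F_k}} - \frac{(\Delta F_k)^2}{2\,\sigma_{F_k} F_{k,\rm model}} - \frac{F_k\, \Delta F_{\rm base}}{\sigma_{F_k} F_{\rm base,model}} . \qquad (26)$$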



The first term is typically on the order of 1, while the second
term is on the order of $\sigma_{F_k}/F_k \ll 1$ and the third one of
$\sigma_{F_{\rm base}}/\sigma_{F_k} \ll 1$. Therefore, the last two terms can be dropped
and the expression for $\chi_k$ reduces to

$$\chi_k \approx \frac{\Delta F_k}{\sigma_{F_k}} , \qquad (27)$$

which is identical to the expression used in the flux template
fitting method shown in Eq. (23).
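
The equivalence can be checked numerically for a single filter. The sketch below uses made-up fluxes and natural logarithms of flux ratios; this choice is an assumption of the sketch and is exact for magnitudes, because the factor 2.5/ln 10 appears both in the color index and in its error and cancels in the ratio.

```python
import math

def chi_flux(f_obs, f_model, sigma_f):
    """Flux-based distance component: Delta F_k / sigma_Fk."""
    return (f_obs - f_model) / sigma_f

def chi_color(f_obs, f_model, sigma_f, fb_obs, fb_model, sigma_fb):
    """Color-based distance component, with colors taken as logarithms of
    flux ratios and errors propagated from the relative flux errors."""
    diff = math.log(f_obs / fb_obs) - math.log(f_model / fb_model)
    err = math.hypot(sigma_f / f_obs, sigma_fb / fb_obs)
    return diff / err

# Made-up fluxes: a 5% error in filter k, a 0.1% error in the base filter.
cf = chi_flux(103.0, 100.0, 5.0)
cc = chi_color(103.0, 100.0, 5.0, 1000.5, 1000.0, 1.0)
```

With a base filter that is nearly free of flux errors, `cc` stays close to `cf`; increasing `sigma_fb` or the relative error in filter k makes the two values drift apart, as expected from the dropped terms.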

2.6. System of color indices 

In the previous section, we had discussed the relevance of a 
common base filter for the various color indices, which is sup- 
posed to have relatively small flux errors in order to keep the 
color errors as low as possible. Our multiband survey applications
usually involve a smaller number of broad bands as well
as a larger number of medium-band observations. For these,
we decided to form color indices from broad bands neighboring
on the wavelength axis, i.e. U-B, B-V, V-R and R-I, which we
assume to be the optimum solution for comparably deep bands.
Each of the shallower medium bands we combine with the nearest
broad band in terms of wavelength, which then serves
as a base filter for the medium-band color indices, e.g. B-486
or 605-R, where letters denote broad bands and numbers represent
the central wavelength of medium-band filters measured
in nanometers.

In terms of flux template fitting, this scheme of color in- 
dices means, that we use a few deep broad bands to fit the 
global shape of the SED, and then use a few groups of medium 
bands around each deep broad-band to fit the smaller-scale 
shape locally. The $\chi^2$-values of the global fit and the several
local fits are then just added up to the total $\chi^2$. This scheme has
a particular advantage over a solely global flux fitting: the local 
fits can well trace spectral structures, even if the global distri- 
bution of the object differs from the template (e.g. as it could 
be caused by extinction). Therefore, it is not too dependent on 
accurate global template shapes and it can use the ability of 
the medium bands to discriminate narrow spectral features for 
a more accurate classification. Of course, this advantage van- 
ishes immediately for a pure broad-band survey, where local 
structures in the spectrum are not traced, and therefore no local 
fits are available for the $\chi^2$-sum.
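
The pairing rule for the color indices described in this section can be written down in a few lines. The central wavelengths of the broad bands below are rough placeholders and not the filter data of any particular survey, and the function name `color_indices` is hypothetical.

```python
# Central wavelengths in nm; these values are rough placeholders, not the
# actual filter data of any survey.
BROAD = {"U": 365, "B": 440, "V": 550, "R": 650, "I": 800}

def color_indices(broad, medium):
    """Pair neighboring broad bands, and pair each medium band with the
    broad band closest in wavelength. The bluer filter is named first."""
    names = sorted(broad, key=broad.get)
    indices = [f"{a}-{b}" for a, b in zip(names, names[1:])]
    for wl in sorted(medium):
        base = min(names, key=lambda n: abs(broad[n] - wl))
        indices.append(f"{base}-{wl}" if broad[base] < wl else f"{wl}-{base}")
    return indices

indices = color_indices(BROAD, medium=[486, 605])
```

Each medium band enters exactly one color index, so the local fits around the individual broad bands contribute separate terms to the total sum.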

3. The classification libraries 

We assembled the color libraries from intrinsic object spectra
assuming no galactic reddening. Clearly these libraries can
only be sufficient when observing fields with low extinction
and little reddening. Usually, such fields are chosen for deep
extragalactic surveys and the CADIS fields in particular were
carefully selected to show virtually no IRAS 100 $\mu$m flux (below
2 MJy/sterad), so we expect "zero" extinction and reddening
there. When applying this color classification to fields with
reddening, the libraries would have to be changed accordingly.

Fig. 1. This diagram shows a few selected spectra from our template libraries. The shown wavelength scale runs from 315 nm to
1000 nm for stars (left), from 125 nm to 1600 nm for galaxies (center) and from 100 nm to 550 nm for quasars (right). The flux is
$\lambda f_\lambda$ in units of photons per nm, time interval and sensitive area and offset by one unit per step within a class. The flux scale is
normalised to unity at 800 nm for stars, arbitrary for galaxies, and normalised to 0.2 at 250 nm for quasars. The stellar templates
are taken from Pickles (1998), the galaxy templates from Kinney et al. (1996) and quasar templates are modelled after Francis
et al. (1991). The quasar diagram shows nine spectra with three different spectral indices (-2.0, -0.6, +0.8) and three different
relative emission-line intensities (0.6, 2.1, 5.7).

Obviously, the libraries should contain a representative va- 
riety of objects, but still they can never be assumed to cover a 
complete class including all imaginable oddities. When classes 
are enlarged to cover as many odd members as possible, there 
is a trade-off to be expected between classifying the odd ones 
right, and introducing more spatial overlap between the classes 
in general, i.e. introducing more confusion among normal ob- 
jects. The spectral libraries we employ are partly based on 
observations only and partly mixed with model assumptions. 
Our particular choice of libraries is founded on experience we
gained within the CADIS survey, where we found several other
published templates to be less useful.



3.1. The star library 

For the stars, we picked the spectral atlas of Pickles (1998), which
contains 131 stars with spectral types ranging from O5 to M8.
It covers different luminosity classes but concentrates on main
sequence stars, and it also contains some spectra for particularly
rich metallicities. For the surveys in consideration, very
young and very luminous stars should not be expected, but we
include the entire library nevertheless (see Fig. 1). Stars later
than M8 are missing in the library, but they do show up in deep
surveys like CADIS (Wolf et al. 1998). These objects are interesting
on their own, of course, but they are so rare that a couple
of misclassifications do not hurt the statistics on other objects.

In earlier stages of the CADIS survey, we reported using the 
Gunn & Stryker (1983) atlas of stellar spectra (see e.g. Wolf 
et al. 1999), which has a number of disadvantages compared 
to the new work by Pickles. Our impression is that the Pick- 
les spectra have a better calibration in the far-red wavelength 
range and are less affected by noise there. Especially, broad 
absorption troughs in M stars are rendered more accurately in 
the Pickles templates, which can be quite relevant for medium- 
band surveys. Also, they cover the NIR region and, e.g., the 
entire CADIS filter set all the way out to the K' band, thereby 
omitting the need for homemade extrapolations. Since it con- 
tains two different metallicity regimes, it covers the range of 
possible stellar medium-band colors better than the Gunn & 
Stryker atlas, most notably among M stars for colors sensitive 
to their deep absorption features and, e.g., among K stars for 
colors probing the Mg I absorption. 

The atlas is not structured as a regular grid in the stellar pa- 
rameters and we consider the resulting color library an unsorted 
set without internal structure. If variations in dust reddening are 
to be expected within the field as in the case of Galactic stel- 
lar observations, this effect should be treated as an additional 
parameter in the library. 

For multi-color surveys aiming specifically at Galactic 
stars, one would ideally like to have a library organized as a 
regular grid in effective temperature, surface gravity and metal- 
licity, which could, e.g., be derived from model atmospheres. 
Such a fine classification is not needed for extragalactic sur- 
veys, where the focus is on galaxies and quasars. We gained 
some experience with the stellar spectra from the model grid 
by Allard (1996), but we decided not to use it, since the overall 
colors seemed to be better matched by the Pickles library. 

3.2. The galaxy library 

The galaxy library is based on the template spectra by Kin- 
ney et al. (1996). These are ten SEDs averaged from integrated 
spectra of local galaxies ranging in wavelength from 125 nm to 
1000 nm. The input spectra of quiescent galaxies were sorted 
by morphology beforehand to result in four templates called
E, S0, Sa and Sb. The starburst galaxies were sorted by color
into six groups yielding six more templates called SB6 to SB1.
Based on the observation, that color and morphology of galax- 
ies correlate, this template design seems reasonable. This way 
the classification can indirectly measure morphology of galax- 
ies via their SED, at least as far as the locally determined color- 
morphology relation holds at higher redshift. 

The templates contain a very deep unidentified absorption
feature around 540 nm, which we supposed to be an artifact of
the data reduction and eliminated. We left the abundant structures
in the UV unchanged, although some of them might be
noise and we do not know how to interpret them. We modelled
a near-infrared addition heuristically by a simple law consistent
with the $I - K'$ colors of a sample of galaxies with known
spectroscopic redshifts (see paper II). Using this addition, we
extended the spectra out to 2500 nm, and actually replaced the
spectrum starting from 800 nm to eliminate the noise in the
templates redwards of 800 nm (see Fig. 1). Quiescent galaxies
were extended according to $f_\nu \propto \nu^{\,\cdots}$, while starburst galaxies
seemed most consistent with an extension of $f_\nu \propto \nu^{-1/3}$.

We consider the templates to form a one-dimensional SED
axis of increasingly blue galaxies and fill in more templates to
obtain a dense grid of 100 SEDs. Our interpolation is done linearly
in color space, and the number of filled-in SEDs is chosen
such that the color space is filled rather uniformly. The new
SEDs are denominated as numbers from 0 to 99, where the ten
original SEDs used for the interpolation reside at the following
numbers:

E    S0   Sa   Sb   SB6  SB5  SB4  SB3  SB2  SB1
0    15   30   45   75   80   85   90   95   99
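
The filling of the SED axis can be sketched as a piecewise linear interpolation of color vectors between the ten anchor positions listed above. The template colors in the example are made up, and the function name `interpolate_seds` is hypothetical; the sketch takes the anchor positions as given and does not reproduce how they were chosen.

```python
ANCHORS = [0, 15, 30, 45, 75, 80, 85, 90, 95, 99]

def interpolate_seds(template_colors, anchors=ANCHORS):
    """Linearly interpolate color vectors between anchor templates.
    template_colors[i] is the color vector of the template at anchors[i]."""
    grid = []
    for lo, hi, c_lo, c_hi in zip(anchors, anchors[1:],
                                  template_colors, template_colors[1:]):
        for n in range(lo, hi):
            w = (n - lo) / (hi - lo)
            grid.append([(1 - w) * a + w * b for a, b in zip(c_lo, c_hi)])
    grid.append(list(template_colors[-1]))
    return grid

# Ten made-up two-color templates, getting bluer along the sequence.
templates = [[1.0 - 0.1 * i, 0.8 - 0.05 * i] for i in range(10)]
sed_grid = interpolate_seds(templates)
```

The wide gap between the anchors at 45 and 75 receives many interpolated SEDs, while the starburst templates are only five or four steps apart.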

Internal reddening is considered an important effect for the 
colors of galaxies and especially common among later types. 
While trying to account for it, we realized that its effect is 
merely one of shifting the zeropoint in the SED and hardly one 
of changing the redshift estimates. If we did introduce an independent
reddening parameter, it would be almost collinear with
the SED axis itself. Therefore, we opted for using the templates
as determined from real galaxies and provided by Kinney et al. 
(1996), since they probably contain already a typical distribu- 
tion of reddened objects. Due to our scheme of SED interpola- 
tion, we can still classify galaxies, which are reddened more or 
less than usual. 

We also tried to change the SED interpolation scheme by 
relocating the templates to different SED numbers, which did 
not seem to improve the results. The color library was calculated
for 201 redshifts ranging in steps of $\Delta z = 0.01$ from
$z = 0$ to $z = 2$, finally containing 201 × 100 members. We
did not intend to go beyond a redshift of 2, since our survey ap- 
plications have typically not become deep enough, yet, to see 
such objects in useful numbers. 

The main shortcoming of this library is that the 1- 
dimensional SED allows no variation in emission-line ratios 
independent of the global galaxy color. Since medium-band fil- 
ters can contain strong emission-line signals from faint galax- 
ies, an observed emission-line ratio detected by two suitably lo- 
cated filters can be in disagreement with the global SED traced 
by all other filters. Since especially the CADIS filters are placed 
to deliver multiple detections of emission lines at several se- 
lected redshifts, some degradation in real performance could 
be expected with respect to the simulation (see paper II). 

3.3. The quasar library 

The quasar library is designed as a three-component model:
We add a power-law continuum with an emission-line contour
based on the template spectrum by Francis et al. (1991),
and then apply a throughput function accounting for absorption
bluewards of the Lyman-$\alpha$ line. We modeled a throughput
function $T_0$ after visually inspecting spectra of $z \approx 4$ quasars
published by Storrie-Lombardi et al. (1996), and keep its shape
constant (see Fig. 3) while varying its scale to follow the increasing
continuum depression $D_A$ towards high redshift. Using
data from Kennefick (1996) and Storrie-Lombardi et al.
(1996) as a guideline, we arrived at

$$T(z) = T_0^{\,(z/4.25)^2} . \qquad (28)$$

Fig. 2. The quasar library is based on an emission line contour
taken from the quasar template spectrum by Francis et al.
(1991). The wavelength scale runs from 100 nm to 550 nm and
the flux is $\lambda f_\lambda$ in units of photons per nm, time interval and
sensitive area (arbitrary units).

Fig. 3. For the quasars we assumed a throughput function for
the Lyman-$\alpha$ forest which we derived from a visual inspection
of quasar spectra published by Storrie-Lombardi et al. (1996).
The scale of this function depends on redshift and is shown for
z = 3.5, 4.25 and 5.0. [Plot: throughput versus lambda (nm), with
axis ticks from 60 to 140.]

The intensity of the emission-line contour was varied only 
globally, i.e. with no intensity dispersion among the lines. As 
long as typically only one medium-band filter is brightened by 
a prominent emission line, the missing dispersion should not 
affect the classification (see Fig. 2). For the intensity factor relative
to the template, $e^{\epsilon}$, ten values were adopted ranging in
steps of $\Delta\epsilon = 0.25$ from $\epsilon = -0.5$ to $\epsilon = 1.75$ on a logarithmic
scale, which is roughly 0.6 times to 5.7 times the template intensity.
Originally, we tried a range from 0.3 times to 2.7 times,
but the first twenty quasars found in CADIS contained mostly
strong lines, which are better represented by the current limits.

The slope of the power-law continuum $f_\nu \propto \nu^{\alpha}$ was varied
in 15 steps of $\Delta\alpha = 0.2$ ranging from $\alpha = -2.0$ to $\alpha = 0.8$.
The library was calculated for 301 redshifts ranging in steps
of $\Delta z = 0.02$ from $z = 0$ to $z = 6$, finally containing 301 ×
15 × 10 = 45150 members. As a future improvement one could
imagine the inclusion of Seyfert I galaxies with nuclei of rather 
low luminosity, i.e. spectra coadded as a superposition of a host 
galaxy spectrum with a broad-line spectrum for the nucleus. 
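
The size of the quasar library follows directly from the three parameter grids quoted above, which the following sketch enumerates. It assumes that the logarithmic intensity scale is a natural logarithm, an assumption that reproduces the quoted range of roughly 0.6 to 5.7 times the template intensity; the variable names are made up.

```python
import math

redshifts = [round(0.02 * i, 2) for i in range(301)]    # z = 0 ... 6
slopes = [round(-2.0 + 0.2 * i, 1) for i in range(15)]  # alpha = -2.0 ... 0.8
log_int = [-0.5 + 0.25 * i for i in range(10)]          # epsilon = -0.5 ... 1.75

# One library member per (z, alpha, epsilon) combination.
members = [(z, a, e) for z in redshifts for a in slopes for e in log_int]

# Intensity factors relative to the template, exp(epsilon).
factors = [math.exp(e) for e in log_int]
```

The spectrum belonging to each parameter triple is not computed here; the sketch only covers the bookkeeping of the regular grid.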

3.4. Calculation of color libraries 

As a first step, the spectral libraries were transformed into color 
index libraries representing precisely the set of filters and in- 
struments in use. The use of precalculated filter measurements 
rather than fully resolved flux spectra removes any computa- 
tionally expensive calculations for synthetic photometry from 
the process of classifying the object list. The use of color indices
removes the need for any flux normalisation, further speeding
up the classification. A list of $\sim 10^4$ objects and $\sim 10$
colors can be classified within a couple of hours on a SUN Enterprise
II workstation even when using $\sim 10^5$ templates.

For best results it is required that the color libraries are cal- 
culated for an instrumental setup resembling precisely the ob- 
served one, i. e. the synthetic photometry calculation has to take 
every dispersive effect into account. We decided to use photon 
flux colors derived from the observable object fluxes, averaged 
over the total system efficiency of each filter and assuming an 
average atmospheric extinction. 
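
A photon flux color of this kind can be sketched as the ratio of two efficiency-weighted band averages. The sketch assumes that spectrum and efficiency curves are tabulated on a common wavelength grid and that the total system efficiency already includes the average atmospheric extinction; the box-shaped efficiency curves and all names are made up.

```python
import math

def band_average(wavelengths, photon_flux, efficiency):
    """Efficiency-weighted mean photon flux over one filter, using
    trapezoidal integration on a common wavelength grid."""
    num = den = 0.0
    for i in range(len(wavelengths) - 1):
        dw = wavelengths[i + 1] - wavelengths[i]
        num += 0.5 * dw * (photon_flux[i] * efficiency[i]
                           + photon_flux[i + 1] * efficiency[i + 1])
        den += 0.5 * dw * (efficiency[i] + efficiency[i + 1])
    return num / den

def color_index(wavelengths, photon_flux, eff_a, eff_b):
    """Color index between filters a and b on a magnitude-like scale."""
    fa = band_average(wavelengths, photon_flux, eff_a)
    fb = band_average(wavelengths, photon_flux, eff_b)
    return -2.5 * math.log10(fa / fb)

# Made-up example: two box-shaped efficiency curves on a 10 nm grid.
wl = [400 + 10 * i for i in range(41)]                 # 400 ... 800 nm
eff_blue = [0.5 if 450 <= w <= 550 else 0.0 for w in wl]
eff_red = [0.4 if 650 <= w <= 750 else 0.0 for w in wl]
flat = [1.0] * len(wl)
rising = [w / 600.0 for w in wl]                       # more photons in the red
```

Because each band average is normalised by its own efficiency integral, a spectrum with constant photon flux has a color index of zero regardless of the filter throughputs.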

The shape of the filter transmission curves needs to be 
known precisely, and is in the best case measured within the 
imaging instrument itself under conditions identical to the real 
imaging application. This is easily possible with, e. g., the 
Calar Alto Faint Object Spectrograph (CAFOS) at the 2.2m 
telescope on Calar Alto, Spain: in this instrument light from an 
internal continuum source is sent first through the filterwheel 
and second through the grism wheel before reaching the detector.
Images are taken with and without the filter, so their ratio
immediately gives the transmission curve. Colors measured
in narrow filters depend sensitively on the transmission curve, 
whenever strong spectral features are probed, e.g. the contin- 
uum drop at the Ca H/K absorption or the Mg I absorption 
in late-type stars. In these cases the curve needs to be known 
rather precisely, since otherwise the calibration would be off, 
and misclassifications could occur. 

3.5. Potential improvements on the classification 

The quality of the classification reached depends on just the 
three elements of the method: the quality of the measured data, 
the choice of the classifier and the quality of the libraries form- 
ing the knowledge database for the comparison. In principle, 
improvements on the performance can be achieved only in the 
following respects: 

- improvements on the data: the filter set could be changed
and potentially be tailored to a specific goal of the survey;
the exposure time could be increased or distributed better
among the individual filters; the accuracy of the calibration
could be increased;

- improvements on the classifier: the classifier can not be improved
fundamentally. A very crucial ingredient for a statistically
correct result is a valid assessment of the measurement
errors since they form a basic input to the probability
calculation. Some simplifications have been introduced
which make a difference only among faint objects which
are hardly classifiable anyway. For specific goals the classifier
can be modified accordingly, and diverse goals potentially
ask for contrary strategies. A global maximum in the
classification reliability is best achieved by weighting rare
classes lower, while searching for rare objects would profit
from weighting them higher.

- improvements on the libraries: this is the most important
aspect, since any library-based classifier can obviously recover
identifications only if they are contained in the library,
and if their spectral profile is well-known. Altogether,
this work has strongly benefitted from templates
and libraries published in the literature (Francis et al. 1991;
Kinney et al. 1996; Pickles 1998), which we could arrange
into an ordered database. Their limitations are discussed
in the respective section. Our present experience suggests
that empirical spectra work better for our survey data than
purely theoretical spectra. In the future, we would also like
to check some of our model assumptions used in the libraries
against observations.

Fig. 4. These diagrams of V-R vs. R-I color show the class models of stars (black) and galaxies (grey) on the left, and stars
(black) and quasars (grey) on the right to illustrate their location in color space. The colors plotted are photon flux color indices,
which are offset compared to astronomical magnitudes, such that Vega has $V - R = -0.41$ and $R - I = -0.61$.

4. Simulation of competitive filter sets 

Initially, it should be natural to assume that surveys with dif- 
ferent filter sets show quite a different performance in terms of 
classification and redshift estimation. If a survey aims for ob- 
jects with very particular spectra, the filter set can certainly be 
tailored to this purpose. If the objects of interest span a whole 
range of spectral characteristics, it is not trivial to guess via 
analytic thinking which filter set performs best. 

Originally, this method was developed for CADIS using 
real CADIS data to test it. Then, we intended to optimize it 
and try to draw conclusions about survey strategies. Aiming 
for more insight into the question of filter choice, we performed
Monte-Carlo simulations on different model surveys by feeding
simulated multi-color observations of stars, galaxies and
quasars into our algorithm. Here, we present a comparison of
three fundamentally different filter sets and show their resulting 
performance for classification and redshift estimation. 

The three model surveys spend the same total amount of 
exposure time on different filter sets, but use the same instru- 
ment, telescope and observing site. We chose the Wide Field 
Imager (WFI) at the 2.2-m-MPG/ESO-telescope on La Silla as 
a testing ground, because it provides a unique, extensive set of 
filters ranging from several broad bands to a few dozen medium 
bands to choose from. Furthermore, the WFI is a designated 
survey instrument which is extensively used by the astronomi- 
cal community. 

4.1. Filtersets and exposure times 

The three modelled surveys, here called setup "A", "B" and
"C", each spend 150 ksec of exposure time distributed on the
following filters (see also Tab. 1):

Setup A spends 50 ksec on the five broad-band filters of
the WFI (UBVRI) and 100 ksec on twelve medium-band filters.
Using ESO's exposure time calculator V2.3.1 for the WFI, we
related exposure times to limiting magnitudes assuming a seeing
of 1.4", an airmass of 1.2, point source photometry and a
night sky illuminated by a moon three days old. The exposure
times are distributed such that a quasar with a power-law continuum
$f_\nu \propto \nu^{\alpha}$ and a spectral index of $\alpha = -0.6$ is observed
with a uniform signal-to-noise ratio in all medium bands. As a
result, the twelve medium bands each deliver a $10\sigma$ detection
of an $R = 23.0$ quasar.

Setup B spends 50 ksec on the same broad bands but concentrates
the 100 ksec for medium-band work on only six filters,
then reaching a uniform $10\sigma$ detection of an $R = 23.38$ quasar.

Setup C finally spends all 150 ksec on the broad-band filters
and omits the medium bands entirely.
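
The depth gained by setup B over setup A in the medium bands can be reproduced with a simple scaling relation. The sketch assumes background-limited noise, where the signal-to-noise ratio grows with the square root of the exposure time, and it assumes that halving the number of filters doubles the time per filter; the function name is hypothetical.

```python
import math

def limiting_magnitude(m_ref, t_ref, t_new):
    """Limiting magnitude at fixed signal-to-noise after changing the
    exposure time, assuming background-limited noise (S/N ~ sqrt(t))."""
    return m_ref + 1.25 * math.log10(t_new / t_ref)

# Setup B puts the medium-band time into six filters instead of twelve,
# i.e. twice the exposure time per filter compared with setup A.
m_b = limiting_magnitude(23.0, 1.0, 2.0)
```

Under these assumptions the limit moves from R = 23.0 to about R = 23.38, in line with the numbers quoted for the two setups.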



In Sect. 4.2 and Sect. 4.3 we present the performance results
for setup A, which has actually been used for a recent multi-color
survey (Wolf et al. 2000b). The relative performance of
the three setups is compared in Sect. 4.5. In Sect. 0 we attempt
to derive some basic analytic conclusions.

Fig. 5. Monte-Carlo simulation for the classification of stars, galaxies and quasars with setup A and R = 22 ... 25. The probability
for a simulated object to be assigned to its original class is plotted over the color $B - V$ for stars and over the redshift
for galaxies and quasars, where $B - V$ is an astronomical magnitude. In case of the galaxies black dots denote quiescent galaxies
(SED<60) and grey dots are starburst systems (SED>60). For bright objects the performance is limited by a systematic
uncertainty of 3% assumed as a minimum error for the color indices.



The simulations are carried out by creating a list of test ob- 
jects from the color libraries presented in Sect. 3. We assume 
a certain R-band magnitude and calculate the individual filter 
fluxes and corresponding errors for each object. Then we scat- 
ter the flux values of the objects according to a normal distribu- 
tion of the flux errors. Finally, we recalculate the resulting color 
indices and index errors and use this object list as an input to 
the classification. 
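
One step of such a simulation can be sketched as follows. The fluxes, errors and filter names are made up, the color index errors are obtained by first-order error propagation, and the function name `simulate_object` is hypothetical; the sketch is not the code used for the results below.

```python
import math
import random

def simulate_object(fluxes, errors, base, rng):
    """Scatter fluxes by Gaussian errors, then rebuild color indices and
    their errors relative to the filter named `base`."""
    noisy = {k: rng.gauss(f, errors[k]) for k, f in fluxes.items()}
    colors, color_errors = {}, {}
    for k in fluxes:
        if k == base or noisy[k] <= 0.0 or noisy[base] <= 0.0:
            continue  # no color index from a non-positive flux
        colors[k] = -2.5 * math.log10(noisy[k] / noisy[base])
        rel = math.hypot(errors[k] / noisy[k], errors[base] / noisy[base])
        color_errors[k] = 2.5 / math.log(10.0) * rel
    return colors, color_errors

rng = random.Random(42)                       # fixed seed for repeatability
fluxes = {"R": 100.0, "V": 80.0, "I": 120.0}  # made-up library fluxes
errors = {"R": 2.0, "V": 8.0, "I": 6.0}
colors, color_errors = simulate_object(fluxes, errors, "R", rng)
```

Repeating the call for every test object yields the scattered object list that is then passed to the classification.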

For the stars we use just 131 test objects as there are mem- 
bers in the library. For the test galaxies we take only every third 
member of the present library giving us 6700 objects. From the 
quasar library we use every seventh object resulting in 6450 
quasars per test run. 

These simulations show us how well the classification can 
possibly work, assuming that real objects will precisely mimic 
the library objects. Every real situation will contain differences 
between SED models and SED reality, sometimes called "cosmic
variance", which will worsen the performance of every real
application. Nevertheless, the simulation highlights the princi- 
pal shortcomings of the method itself and the chosen filter set 
in particular. Therefore, it can be used to judge the relative per- 
formance of different filter sets. 

Table 1. Filters and $10\sigma$ magnitude limits for the three survey
setups compared with Monte-Carlo simulations. The I-band filter
is a long wavelength passband filter with a cut-on wavelength
roughly at 780 nm. Its far-red sensitivity limit is given
by the dropping quantum efficiency of the CCDs. All filters are
installed in the Wide Field Imager at the 2.2m-MPG/ESO telescope
at La Silla observatory.

Table 2. Classification matrix for objects of R = 23 in setups
A and C as derived from Monte-Carlo simulations. An input
vector containing a true number distribution of objects among
the three object classes would be mapped by this matrix onto
a classified distribution among four classes. Numbers below
0.005 are left blank.

Fig. 6. Monte-Carlo simulations for the multi-color redshifts of
galaxies and quasars with R = 22 ... 25 according to the Maximum
Likelihood estimate in setup A. In case of the galaxies
black dots denote quiescent galaxies and grey dots are starburst
systems. This diagram shows the redshift estimates for
all galaxies, however they were classified, but only for those
quasars passing the classification limit of 75%.

We run these tests for stars, galaxies and quasars with magnitudes
of R = 22, 23, 24 and 25, respectively, in order to see
how the classification performance degrades from optimum to
useless with decreasing object flux. Given that R = 23 corresponds
roughly to the $10\sigma$ limit of setup A, the most shallow
survey, we expect that the classification has almost reached its
best performance at R = 22. This is due to our assumption of a
3% uncertainty in the calibration, which causes even the brightest
objects with the best photon statistics to perform not much
better than an object detected only at a $30\sigma$ level. Finally, at
R = 25 objects are well detected only in the broad-band filters,
while the medium bands yield only fluxes with errors higher
than 40%. We expect the surveys to be almost useless at this
level.
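
The saturation of the performance for bright objects, caused by the assumed 3% calibration uncertainty, can be illustrated with a short calculation. The sketch assumes that the calibration error is added in quadrature to the photon-noise error, which is one plausible way of imposing a minimum error and not necessarily the one used in the simulations; the function name is hypothetical.

```python
import math

def effective_snr(photon_snr, floor=0.03):
    """Signal-to-noise after adding a relative calibration error in
    quadrature to the photon-noise error."""
    rel_err = math.hypot(1.0 / photon_snr, floor)
    return 1.0 / rel_err
```

Under this assumption the effective signal-to-noise ratio can never exceed 1/0.03, i.e. about 33, however bright the object is, while a detection at the 10-sigma level is hardly affected by the floor.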



4.2. Classification performance for setup A 

We now look at the classification performance as achieved in
setup A, the model survey with the highest number of filters,
but the shallowest exposures in terms of photon flux detection:
For R = 22 it turned out that the classification works almost
perfectly (see uppermost row of diagrams in Fig. 5). Generally,
more than 99% of all test objects in any class are correctly
classified.

At R = 23, usually less than 5% of all objects in any
class get lost to unclassifiability. Most affected with 10% incompleteness
are quasars at z < 2.5 with red spectra and weak
emission lines. In this simulation, their location in color space
overlaps with starburst galaxies at redshift 1.6 < z < 2.0. So
far, our galaxy templates contain no information in the spectral
range bluewards of the Lyman-$\alpha$ line leaving their U-band flux
blank in this redshift range. As a result, the classification omits
this band for the comparison with the library galaxies.

At R = 24, about one third of the stars get lost. These 
are mostly yellow stars which are too faint in every filter to 
be classified unambiguously. Rather blue and rather red stars 
are still successfully classified, because either on the blue or 
on the far-red side of the filter set they still show significant 
fluxes and sufficiently accurate color indices. About a quarter 
of the galaxies would be missed, which are either blue galaxies 
not showing strong continuum features or red galaxies at 
redshifts low enough to render them faint in the far-red 
filters, too. Also, a quarter of the quasars are lost, either 
red quasars at z < 2.5 overlapping again with starburst galaxies 
at 1.6 < z < 2.0, or quasars at z > 2.5 with weak emission lines 
overlapping with early-type galaxies at z < 0.4. 

At R = 25, the classification has finally become highly 
incomplete, but can still find very blue stars and very red 
extragalactic objects like quiescent galaxies and quasars at 
higher redshift (see bottom row in Fig. 5, see also Fig. 9 for 
precise numbers). 

In all simulations, most incorrectly classified objects are 
unclassifiable and only a minority of them are scattered into 
another class (see also the classification matrix table). In 
particular, the quasar class does not seem to be strongly 
contaminated by false candidates. At any magnitude in any 
setup, less than 1% of the galaxies are scattered into the 
quasar candidates, except for setup C at R = 25. Still, this 
contamination of the quasar class is not negligible, since a 
minor fraction of a rich class can be a large number in 
comparison with a poor class. In CADIS we found about 3% of the 
extragalactic objects at R < 23 to be quasars. A contamination 
of less than 1% then means that less than a quarter of the 
quasar candidates should be galaxies. 
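
The arithmetic behind this estimate can be sketched as follows. 
This is a minimal illustration, assuming that quasars make up 3% 
and galaxies the remaining 97% of the extragalactic objects, that 
every quasar is recovered, and that exactly 1% of the galaxies 
leak into the quasar class. The function name and the assumption 
of perfect quasar completeness are ours and not part of the 
method. 

```python
def galaxy_fraction_among_candidates(quasar_share, galaxy_leak_rate,
                                     quasar_completeness=1.0):
    """Fraction of quasar candidates that are actually galaxies.

    quasar_share        -- fraction of extragalactic objects that are quasars
    galaxy_leak_rate    -- fraction of galaxies misclassified as quasars
    quasar_completeness -- fraction of quasars recovered (assumed 1.0 here)
    """
    galaxy_share = 1.0 - quasar_share
    false_candidates = galaxy_share * galaxy_leak_rate
    true_candidates = quasar_share * quasar_completeness
    return false_candidates / (false_candidates + true_candidates)


# 3% quasars and a 1% leak rate, the upper bound quoted in the text
fraction = galaxy_fraction_among_candidates(0.03, 0.01)
print(f"{fraction:.3f}")  # 0.244, i.e. just under a quarter
```

Since 1% is an upper limit on the leak rate, the resulting 
fraction is an upper limit as well. A lower quasar completeness 
would raise it. 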

4.3. Multi-color redshifts in setup A 

Fig. 6 displays the comparison of the photometric MEV+ 
redshift estimates in setup A with the original true redshifts 
of the simulated objects. At R = 22 (see uppermost row of 
Fig. 6) the redshifts work quite satisfactorily for galaxies and 
quasars, as demonstrated by nearly all objects residing on the 
diagonal of identity. 

Towards fainter magnitudes, the galaxy redshifts degrade 
first at both the lower and the higher redshift ends. The deepest 
working magnitudes are reached in the redshift range of 
0.5 < z < 1. This feature is due to the location of the 4000 
Å-break: When the break is located in the central wavelength 
region of the filter set, many filters are available on either 
side of the break to constrain its location rather well even for 
noisy data. For z = 0.15 ... 1.15, the 4000 Å-break is at least 
enclosed by medium-band filters. But if the break is located 
close to the edge of the filter set and, e.g., detected only by 
a noisy signal from a single filter, the true redshift 
interpretation cannot be distinguished well from other options. 
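
As a rough illustration of the redshift range quoted above, the 
observed wavelength of the break can be computed from the usual 
relation between rest-frame and observed wavelength. The sketch 
below only assumes a rest wavelength of 4000 Å; the function 
name is ours, and the filter positions themselves are not 
reproduced here. 

```python
BREAK_REST_ANGSTROM = 4000.0  # rest-frame wavelength of the break


def observed_wavelength(rest_wavelength, z):
    """Observed wavelength of a spectral feature at redshift z."""
    return rest_wavelength * (1.0 + z)


# Redshifts spanning the range in which the break is said to be
# enclosed by medium-band filters
for z in (0.15, 0.5, 1.0, 1.15):
    print(z, round(observed_wavelength(BREAK_REST_ANGSTROM, z)))
# The break moves from 4600 A at z = 0.15 to 8600 A at z = 1.15
```

Under this assumption, the range z = 0.15 ... 1.15 corresponds 
to observed break wavelengths between 4600 Å and 8600 Å. 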

Quiescent galaxies still work reasonably well at the higher 
redshift end, because they are brighter in the far-red filters. 
Starburst galaxies mostly degrade at higher redshift, because 
their features in the UV are less discriminating (and less 
reliably known) than those in the visual wavelength range. 

The quasar redshifts remain rather precise at z = 2.2 ... 6.0, 
all the way down to R = 25. This is the redshift range where 
the continuum step over the Lyman-α line can be seen by the 
filter set and redshift estimates are expected to reach deep. 
Of course, at z ≳ 4 the R-band magnitude of quasars appears 
artificially faint, since it is strongly attenuated by the 
Lyman-α forest, but the redder filters contain higher flux 
levels sufficient to constrain the location of the continuum 
step. Redshift confusion arises first in the low-redshift 
region, working its way up to higher redshifts with decreasing 
brightness. At z < 2.2 the continuum shows no Lyman-α forest in 
our filter sets, but only a redshift-invariant power-law shape. 
In this case, the multi-color redshifts rely solely on some 
emission lines showing up in the medium bands. 

Fig. 7. Monte-Carlo simulations for multi-color redshifts of 
galaxies and quasars with setup A and R = 23 according to the 
estimators Maximum Likelihood (ML), Minimum Error Variance 
(MEV) and our advanced MEV with better handling of 
bimodalities (MEV+). In case of the galaxies black dots denote 
quiescent galaxies and grey dots are starburst systems. Shown 
are all galaxies, but quasars only if they passed the 
classification limit of 75%. It seems that ML and MEV+ are 
almost equivalent for quasars, while for galaxies MEV and MEV+ 
make no visible difference. Objects considered uncertain by the 
MEV estimator do not get an MEV estimate assigned, but they 
receive an ML estimate that can potentially be wrong. 

Some concentrated linear structures are visible off the 
diagonal at lower redshift with the best contrast at R = 24. 
Their origin is a misidentification of weak emission lines: 
There are two structures mirrored at the diagonal following the 
linear relations $(1 + z_{phot})/(1 + z) \approx 1.74$ and 
$(1 + z)/(1 + z_{phot}) \approx 1.74$. They are caused by a 
confusion of the Mg II line with the Hβ line. Another structure 
at $(1 + z_{phot})/(1 + z) \approx 1.25$ arises from weak 
Lyman-α lines of very blue quasars which are interpreted as 
C IV lines, or weak C IV lines which are taken to be C III 
lines. The extent of these structures across the diagram 
obviously depends on the visibility of the involved lines 
within the medium-band filter set. Finally, there is a large 
group of quasars estimated to be at nearly zero redshift, but 
truly stretching even beyond z = 3. These are among the quasars 
with the lowest emission line intensities in the library, which 
basically display only their redshift-invariant power-law 
continuum in the filters. 

Fig. 8. Distribution of true redshift estimation error 
($\Delta z = z_{mc} - z$) among simulated galaxies with R = 23 
in setup A, separated for quiescent and starburst objects. The 
solid line shows results for the Minimum Error Variance 
estimators (MEV and MEV+ are virtually the same) and the grey 
line those for the Maximum Likelihood estimator (ML). Starburst 
systems show higher errors and some large mistakes with 
$\Delta z > 0.1$. 

As shown in Fig. 7, the different estimators deliver rather 
comparable results with quite similar redshift accuracy. In case 
of the quasars the improved MEV+ method (which can detect 
bimodal probability distributions) performs differently from 
the standard MEV method but rather similarly to the ML method. 
This is due to bimodalities where the MEV estimate is a 
weighted average of the two present probability peaks, while 
the MEV+ estimate decides for the single peak containing the 
higher probability integral, which is likely to be roughly 
coincident with the ML estimate pointing at the redshift with 
the highest individual probability. Bimodalities can again be 
seen as linear structures off the main diagonal and arise from 
confusion among emission lines. In case of the pure MEV method 
the peak associated with the wrong solution is averaged with 
the correct solution residing on the diagonal, and the MEV plot 
shows smeared out structures around the diagonal rather than 
the linear ones seen in the ML or MEV+ plots. 
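
The ratios quoted for the linear structures caused by emission 
line confusion can be checked against the rest wavelengths of 
the lines involved. The sketch below uses approximate standard 
rest wavelengths, which are our input and not taken from the 
text. If a line is assigned to the wrong transition, the factor 
between (1 + z_phot) and (1 + z) equals the ratio of the two 
rest wavelengths or its inverse, depending on the direction of 
the confusion. 

```python
# Approximate rest wavelengths in Angstrom (standard values, assumed here)
REST_WAVELENGTH = {
    "Lya": 1216.0,
    "CIV": 1549.0,
    "CIII]": 1909.0,
    "MgII": 2798.0,
    "Hbeta": 4861.0,
}


def confusion_factor(line_a, line_b):
    """Ratio of the longer to the shorter rest wavelength of two lines.

    Confusing the two lines shifts (1 + z_phot) relative to (1 + z)
    by this factor or by its inverse.
    """
    wa, wb = REST_WAVELENGTH[line_a], REST_WAVELENGTH[line_b]
    return max(wa, wb) / min(wa, wb)


for pair in (("MgII", "Hbeta"), ("Lya", "CIV"), ("CIV", "CIII]")):
    print(pair, round(confusion_factor(*pair), 2))
```

With these wavelengths the Mg II / Hβ pair gives a factor close 
to 1.74, while the Lyman-α / C IV and C IV / C III] pairs give 
factors of about 1.27 and 1.23, bracketing the quoted 1.25. 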

4.4. Maximum Likelihood redshift versus Minimum Error 
Variance redshift 

We now compare the relative performance of three different 
redshift estimators using the example of galaxies and quasars 
at a fixed magnitude of R = 23. We have used the Maximum 
Likelihood (ML) method, the Minimum Error Variance (MEV) 
method and an advanced MEV method (MEV+) as we defined it in 
Sect. 2.4. 
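
The behaviour of the three estimators on a bimodal probability 
distribution can be illustrated with a toy example. The sketch 
below is not the implementation of Sect. 2.4. It assumes a 
hypothetical redshift grid with two Gaussian peaks, takes the ML 
estimate as the grid point of highest probability and the MEV 
estimate as the probability-weighted mean, and mimics MEV+ by 
splitting the grid at a hand-chosen redshift `split_z` and 
keeping the side with the larger probability integral. All 
function names and numbers are illustrative assumptions. 

```python
import math


def gaussian(z, mean, sigma):
    """Normalised Gaussian density."""
    norm = sigma * math.sqrt(2.0 * math.pi)
    return math.exp(-0.5 * ((z - mean) / sigma) ** 2) / norm


def weighted_mean(zs, ps):
    """Probability-weighted mean redshift."""
    return sum(z * p for z, p in zip(zs, ps)) / sum(ps)


def ml_estimate(zs, ps):
    """Redshift of the single most probable grid point."""
    best_index = max(range(len(ps)), key=lambda i: ps[i])
    return zs[best_index]


def mev_estimate(zs, ps):
    """Mean of the full distribution (minimises the expected squared error)."""
    return weighted_mean(zs, ps)


def mev_plus_estimate(zs, ps, split_z):
    """Toy bimodality handling: keep only the side of split_z that holds
    the larger probability integral and return the mean of that side."""
    low = [(z, p) for z, p in zip(zs, ps) if z < split_z]
    high = [(z, p) for z, p in zip(zs, ps) if z >= split_z]
    mode = max((low, high), key=lambda part: sum(p for _, p in part))
    return weighted_mean([z for z, _ in mode], [p for _, p in mode])


# Hypothetical bimodal distribution: 30% of the probability near z = 0.8
# and 70% near z = 2.4
zs = [0.01 * i for i in range(401)]
ps = [0.3 * gaussian(z, 0.8, 0.05) + 0.7 * gaussian(z, 2.4, 0.05) for z in zs]

print(f"ML   {ml_estimate(zs, ps):.2f}")             # 2.40
print(f"MEV  {mev_estimate(zs, ps):.2f}")            # 1.92
print(f"MEV+ {mev_plus_estimate(zs, ps, 1.6):.2f}")  # 2.40
```

In this toy case the plain mean falls between the two peaks, at 
a redshift that the distribution itself considers very unlikely, 
while the peak-selecting variant agrees with the ML value. 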

While the Maximum Likelihood (ML) method always 
gives a redshift estimate, the Minimum Error Variance (MEV) 
method does not in the way we use it. Some objects have 
probability distributions which are close to flat, yielding a 
redshift estimate that reflects primarily the redshift interval 
chosen for the template library rather than giving a reliable 
physical interpretation of the object. We do not assign any 
estimate to these uncertain objects (as we defined them in 
Sect. 2.4), which is justified since their estimates would be 
meaningless anyway. A caveat for a direct performance 
comparison is the fact that the MEV/MEV+ methods ignore the 
uncertain objects, whose selection function is 
redshift-dependent at the faint end and could furthermore be 
different in a real dataset due to cosmic variance. 



4.5. The three setups in comparison 

All setups are designed to spend the same amount of exposure 
time on a survey field, but distribute it over different filter 
sets. The pure broad-band survey, setup C, collects far more 
photons than the setups A and B, which are mainly exposing 
medium-band filters. But due to their higher spectral 
resolution, we expect setups A and B to contain more 
information per photon. 

In fact, it turns out that the classification performance of 
all three setups is quite similar, which implies that the lack 
of photons in the medium bands is largely compensated by their 
higher information content (see Fig. 9). Among the small 
remaining differences, there is a tendency for the medium-band 
setups to be more efficient in finding quasars, presumably 
because their spectra contain emission lines which are more 
prominent in narrow filters. 

Also, there is a slight trend indicating that the medium-band 
surveys sustain a high level of completeness to somewhat 
fainter magnitudes and then drop more sharply than setup C. In 
the incompleteness range of very faint magnitudes, all setups 
perform about equally poorly. 

The same trends are more clearly present among the multi-color 
redshifts, where we compare the statistics for the ML estimator 
(see Fig. 10): Setups A and B provide a much better redshift 
resolution at the usual working magnitudes. They only fall 
behind the performance of setup C by a rather insignificant 
degree in the faintest regime, where the redshift estimates are 
close to unusable to begin with. This advantage of setup C 
results just from the broad bands being deeper by 0.6 mag, 
where the medium-band filters do not contribute to the result 
anymore. 

For brighter objects, estimates in setup A are better than in 
setup B by an average factor of two, just reflecting the 
difference in spectral resolution. After all, the convolution 
of any measurement with a 0.03 mag Gaussian (to account for the 
calibration errors) makes better photon statistics useless 
among objects which are detected at more than a ∼30σ level. 
Thus, only increasing the number of filters improves the result 
for these objects, while increasing the depth of any filter has 
no effect. 

Fig. 9. Fraction Q of simulated objects which are correctly 
classified in the three different setups (solid line = setup A, 
grey line = setup B, dashed line = setup C). Except for faint 
stars, setup A and B are most successful. 

Fig. 10. Variance of true redshift estimation error 
($\Delta z = z_{mc} - z$) among simulated objects in the three 
different setups (solid line = setup A, grey line = setup B, 
dashed line = setup C) based on Maximum Likelihood estimate. 
Setups A and B provide the highest redshift resolution. Early 
type galaxies work better due to their higher continuum 
contrast at the 4000 Å-break. Nearby quasars without continuum 
features do not work too well, since the redshift estimate has 
to rely on emission lines. 

At this point we would like to emphasize that the calibration 
uncertainty limits the best achievable performance. A large 
calibration error of, e.g., 10% would turn an entire survey 
catalog into a collection of "less-than-10σ objects", at least 
within our method. If calibration is expected to be a problem 
due to instrumentation or observing strategy, this conclusion 
strongly suggests that a large number of filters giving many 
noisy datapoints deliver more information than a few 
long-exposed and formally deep filters that cannot be matched 
together exactly. 
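
The effect of a calibration floor on the usable signal-to-noise 
ratio can be sketched numerically. The example below assumes 
that the relative photon-noise error and the relative 
calibration error add in quadrature, and it treats a 0.03 mag 
error as a relative error of roughly 3%. Both are simplifying 
assumptions of ours, as is the function name. 

```python
import math


def effective_snr(photon_snr, calibration_error):
    """Signal-to-noise ratio after adding a relative calibration
    error in quadrature to the relative photon-noise error."""
    total_relative_error = math.hypot(1.0 / photon_snr, calibration_error)
    return 1.0 / total_relative_error


# Columns: photon S/N, effective S/N for 3% and for 10% calibration error
for photon_snr in (5, 10, 30, 100, 1000):
    print(photon_snr,
          round(effective_snr(photon_snr, 0.03), 1),
          round(effective_snr(photon_snr, 0.10), 1))
```

With a 3% calibration error the effective value saturates near 
33 however many photons are collected, and with a 10% error it 
can never exceed 10. 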

Once more we look into the details within classes: It is no 
surprise that quiescent galaxies with rather prominent 4000 
Å-breaks receive more accurate redshift estimates than 
starburst galaxies with lower-contrast continuum features. When 
comparing equal accuracies, we find that estimates for 
quiescent galaxies typically reach one magnitude deeper than 
for starburst objects. When aiming for a redshift resolution of 
$\sigma_z \approx 0.03$ among quiescent galaxies, it is 
interesting to see that any of the medium-band surveys reaches 
two magnitudes deeper than the broad-band survey (setup C). 

The quasar redshifts work best at z > 2.2, when the estimation 
depends not only on emission lines but can take advantage of a 
strong continuum feature being present within the range of the 
filter set, i.e. the continuum suppression bluewards of the 
Lyman-α line. As in the case of galaxies, setups A and B have 
significantly stronger resolving power in terms of redshift than 
setup C, with setup A again being the best choice. 

It is encouraging to conclude from these simulations that 
photometric redshifts for quasars are feasible and are expected 
to reach accuracies of $\sigma_z \approx 0.1$ in surveys with 
medium-band filters. Furthermore, observations from the CADIS 
survey find a surprising number of faint quasars, whose 
multi-color redshifts were indeed proven by spectroscopy to be 
as accurate as expected from the simulations (see Wolf et al. 
1999 and paper II). 

Altogether, setup A seems to be the most successful among 
the ones discussed for photometric classification and redshift 
estimation. It has no disadvantages compared to the other 
setups; in particular, it does not lack working depth compared 
to the pure broad-band survey. Given the almost vanishing 
differences between setup A and B, there might be no incentive 
to increase the number of filters any further. 

Still, setup A shows a selection function for a successful 
classification with some redshift dependence. Among the 
quasars shown in Fig. 5, we can see some vertical stripes 
containing objects at selected redshifts, which are not 
successfully classified anymore, while the neighboring 
redshifts still work well. In principle, a set of neighboring 
medium-band filters touching in wavelength and covering the 
important spectral range completely would most likely result in 
a selection function with the smoothest shape and smallest 
redshift dependence. 

4.6. Analytic thoughts on filter choice 

In this section, we would like to address the issue of choosing 
an optimal filter set by analytic considerations based on a 
simplified picture of the classification problem. We assume 
that we are still limited by a fixed amount of telescope time, 
which we can distribute over some filters. If different 
wavebands could be imaged simultaneously, it would be obvious 
that even a faintly exposed full-resolution spectrum would be 
better than an unfiltered white light exposure, as long as 
read-out noise of the recording detector is not an important 
constraint. Here, we want to discuss the less obvious scenario 
of consecutive exposures in different wavebands. 

As mentioned in the introduction, the choice of the optimum 
filter set depends entirely on the goal of the survey. For 
surveys aiming at a particular type of objects with 
characteristic colors, tailored filter sets can be designed. 
But if we intend to integrate different survey applications 
into one observational program on a common patch of sky, then 
we need a single survey to identify virtually every object 
unambiguously. In this scenario two choices have to be made: 

1. Assuming constant filter width, the choice between 

a) either fewer filters with more photons each 

b) or more filters with fewer photons each. 

2. Assuming constant filter number, the choice between 

a) broad filters with more photons and less resolution 

b) narrow filters with fewer photons and more resolution. 

We first note that if all colors were equally discriminating 
for each object, the choice would be arbitrary. Any distribution 
on any number of equally wide filters would provide the same 
total discriminative power and classification performance. In 
practice, objects can reside at many different redshifts and 
usually only part of their spectra have discriminating features. 

We now try to obtain some insight into this question based 
on very basic template assumptions. For simplicity, we assume 
just two different possible objects posed to the classification 
algorithm, one of them being a quasar that is distinguished 
only by an emission line from another object with an otherwise 
identical spectrum. 

Addressing choice (1), we find that concentrating on few 
filters would mean that only few quasars display their emission 
line in a filter and can be classified correctly down to some 
limit, while many objects would be unclassifiable. The 
classification would lack completeness, but reach deep for a 
few objects. Distributing exposure time among many filters 
covering the entire spectrum would give every quasar a chance 
to show its emission line, which implies that every object is 
well classifiable but not to the same depth. The classification 
would remain rather complete and degrade more sharply than in 
the case of few filters when reaching its limiting magnitude. 

Addressing choice (2), we assume one of the filters to observe 
the emission line and evaluate the line contrast obtained. As 
long as the line is completely contained in the filter 
bandpass, our signal, i.e. the absolute flux difference to the 
continuum induced by the line, is a constant value irrespective 
of the filter width. The noise is given by the square root of 
the total flux from the object, which increases along with the 
width of the bandpass. The optimum signal-to-noise ratio is 
obtained with a filter matching just the width of the emission 
line. Any narrower filter would cut off line flux, thus 
shrinking the signal more than the noise. 
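
This argument can be made concrete with a small numerical 
sketch. It assumes a top-hat emission line centred in a top-hat 
filter, a flat continuum and pure photon noise. The function 
name, the units and all numbers are hypothetical. 

```python
import math


def line_snr(filter_width, line_width, line_flux, continuum_per_angstrom):
    """S/N of an emission line seen through a top-hat filter centred on it.

    Signal: line counts falling inside the filter.
    Noise:  square root of all counts (continuum plus line) in the filter.
    """
    contained = min(1.0, filter_width / line_width)
    signal = line_flux * contained
    total_counts = continuum_per_angstrom * filter_width + signal
    return signal / math.sqrt(total_counts)


# Hypothetical line of 100 A width and 1000 counts on a continuum
# of 50 counts per Angstrom
widths = [25, 50, 100, 200, 400, 800]
snr = {w: line_snr(w, 100.0, 1000.0, 50.0) for w in widths}
best = max(snr, key=snr.get)

for w in widths:
    print(w, round(snr[w], 2))
print("best filter width:", best)  # 100, the width of the line itself
```

Filters narrower than the line lose signal faster than noise, 
and broader filters only add continuum noise, so the ratio peaks 
where the filter width matches the line width. 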

Using both conclusions we can ask for the optimum strategy 
when aiming for high sensitivity and completeness across 
some redshift range. This goal requires that we observe the 
emission line in any case regardless of the redshift. 
Therefore, we need n filters to cover the entire spectrum in 
question, depending on the filter width 
$\Delta\lambda \propto 1/n$. Given a fixed total amount of 
exposure time, the exposure time per filter and thereby the 
counts measured from the line are $S_{line} \propto 1/n$. The 
total flux $S_{tot}$ in this filter depends on the same 
exposure factor and on $\Delta\lambda$, so that 
$S_{tot} \propto 1/n^2$ and $\sigma_{tot} \propto 1/n$. 
Therefore, the signal-to-noise ratio 
$S_{line}/\sigma_{tot} = \mathrm{const}$, independent of the 
number of filters in any set providing complete coverage. 
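
The scaling argument can be verified with a few lines of code. 
The sketch below assumes that the counts in the filter are 
dominated by the continuum, so that the noise is the square 
root of the continuum counts. The parameter names and default 
values are arbitrary placeholders. 

```python
import math


def line_snr_for_filter_count(n, total_time=1.0, spectral_range=1.0,
                              line_rate=1.0, continuum_rate_density=100.0):
    """Line S/N when n equally wide filters share a fixed total time."""
    exposure = total_time / n       # time per filter scales as 1/n
    width = spectral_range / n      # filter width scales as 1/n
    signal = line_rate * exposure   # line counts scale as 1/n
    total = continuum_rate_density * width * exposure  # scales as 1/n^2
    noise = math.sqrt(total)        # scales as 1/n
    return signal / noise


for n in (3, 5, 17, 50):
    print(n, round(line_snr_for_filter_count(n), 6))
```

Each row prints the same ratio, because both the line counts 
and the noise fall off as 1/n. 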

In summary for the simple quasar example, we have a free 
choice on the filter set, as long as we cover the spectrum. It 
seems that the width of the filters does not affect the 
magnitude limit for a successful classification, but it 
determines directly the redshift resolution. Having the free 
choice, many filters tailored to the typical width of quasar 
emission lines would be the best solution. 

Another example is photometric star-galaxy separation. 
Some red stars display broad-band colors similar to some 
redshifted early-type galaxies. Good photometric accuracy is 
required to tell them apart, especially if only few filters are 
available. With medium-band filters enclosing the redshifted 
4000 Å-break of the galaxies and probing the absorption bands 
of stars, the two classes can easily be discriminated even at 
rather noisy flux levels. 

Let us assume the most general imaginable case for the 
classification problem, where the object spectra can have fea- 
tures with potentially any location and any width (due to red- 
shift as well as class). The arbitrary location calls for a filter 
set covering the entire spectrum. Again, we are left with the 
choice of many narrow versus few broad filters mentioned in 
the simple quasar example just above. And again, as long as 
the features are smaller than the filter width, the choice of fil- 
ters makes no difference to the classification, if the same total 
amount of telescope time is used. 

We now consider an abstract information value I obtained 
by a survey. It depends on the number of filters n, on the 
photons collected in each of them, $N_{ph}(f)$, and on the 
information $I/N_{ph}(f)$ that a single photon carries after 
passing through a given filter. If on average the same amount 
of information is obtained in every filter, we get: 

$$ I = n \times N_{ph}(f) \times \frac{I}{N_{ph}}(f) . \qquad (29) $$ 

For complete coverage the number n of filters again depends on 
the filter width, $\Delta\lambda \propto 1/n$. Given a fixed 
total amount of telescope time, the exposure time per filter is 
$\Delta t \propto 1/n$ and thus the number of photons collected 
is $N_{ph}(f) \propto 1/n^2$. Since narrow filters show 
features with more contrast than broad filters, we can assume 
that the information per photon is inversely proportional to 
the filter width: $I/N_{ph}(f) \propto 1/\Delta\lambda$, and 
thus $I/N_{ph}(f) \propto n$. Altogether, the information 
content of the survey amounts to: 

$$ I = n \times 1/n^2 \times n = \mathrm{const} . \qquad (30) $$ 



In theory, the amount of information in terms of 
classifiability of objects depends only on the total telescope 
time and not on the characteristic width of the filters, as 
long as they cover the entire spectral range in question. The 
smaller number of photons in the medium-band survey is 
compensated by the larger number of filters and the higher 
information content per photon. But this conclusion is based on 
three simplifying assumptions: 

- a much simplified picture of object spectra 

- photon noise being the only source of measurement error 

- libraries resembling true nature accurately and completely 

In practice, there are several advantages for medium-band 
and mixed surveys compared to broad-band surveys, especially 
when combined with our classification scheme: 

- medium-band surveys always provide better redshift reso- 
lution 

- medium-band surveys perform much better at the limit of 
calibration accuracy by providing many more datapoints 
with higher spectral resolution 

- medium-band surveys can tolerate inaccurate libraries and 
cosmic variance much better for the same reason 

- medium-band surveys sampling the wavelength range 
somewhat sparsely (as the model surveys A and B in our 
simulations) can have their filters placed to avoid strong 
night sky emission lines and to suppress background noise 

Especially the last three advantages can cause a medium- 
band survey to reach even deeper than a broad-band survey 
in terms of classification and redshift estimation, although its 
nominal flux detection limits might have suggested inferior per- 
formance to the intuitive judgement. 

The disadvantage of a survey project involving many 
medium-band filters is that it needs a larger minimum amount 
of telescope time, since a few constraints in observational 
strategy have to be met. An optimal survey has requirements for: 

- a minimum number of exposures per filter to eliminate a 
fringe pattern and to close gaps in a CCD mosaic 

- a minimum exposure time for every frame in order to avoid 
being limited by read-out noise 

- a minimum number of filters derived from a certain mini- 
mum coverage of a wide spectral range. 

5. Conclusions for real multi-color applications 



5.1. Calibration of colors 

Obviously, the measurements also need a careful calibration 
among the wavebands. A large erroneous offset can be disastrous 
for the photometric classification of narrow class structures in 
color space. If, e.g., true stars were measured with shifted 
colors, the classification would potentially find them rather 
in the location of library galaxies or quasars, and vice versa. 
Also, the redshift estimates would be thrown off by color 
offsets. 

Calibration problems are of greatest concern when rare objects 
are searched and their class gets contaminated. Especially when 
class volumes are almost touching in color space, even small 
calibration errors can push objects into the wrong class. 
E.g., in many filter sets the quasar class is not well separated 
from stars and galaxies. In the presence of a calibration 
error, abundant galaxies can be pushed into the quasar class, 
potentially making up the largest population among the precious 
candidates. The shape of class volumes is likely to cause quite 
some redshift dependence for the contamination. Then objects 
in some redshift range can become virtually unidentifiable, if 
they are overwhelmed by contaminants. 

If calibration errors were known and quantified, they could 
as well be removed. If they were present but not realized, the 
measurements would look too accurate and a seemingly faith- 
ful classification would be derived, which is potentially wrong. 
Thus, as long as the calibration errors are unknown, it is still 
important to take their potential size into account for the error 
estimates on which the classification is based. As a result, the 
performance of the classification for bright objects is indeed 
limited by the calibration error. 

We assume calibration errors on the order of 3% for the colors 
in our surveys, which implies that the quality of the 
classification saturates for objects that are more than 1.5 mag 
brighter than the 10σ limits of the survey. On the other hand, 
if we assume for the moment poor data reduction or uncorrected 
galactic reddening changing the colors by, e.g., 10%, this 
would turn an entire survey catalog into a collection of 
"less-than-10σ objects", a devastating effect for the survey 
quality. 
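
The figure of 1.5 mag can be checked with a short calculation. 
The sketch below assumes that the signal-to-noise ratio scales 
linearly with flux, which holds when the noise is dominated by 
the background rather than by the object itself. This 
assumption and the function names are ours. 

```python
import math


def snr_brighter_than_limit(limit_snr, magnitudes_brighter):
    """S/N of an object a given number of magnitudes brighter than an
    object at the limit, assuming background-dominated noise."""
    return limit_snr * 10.0 ** (0.4 * magnitudes_brighter)


def magnitude_error(snr):
    """Photometric error in magnitudes for a given S/N (about 1.0857 / snr)."""
    return 2.5 / math.log(10.0) / snr


snr = snr_brighter_than_limit(10.0, 1.5)
print(round(snr, 1))                   # 39.8
print(round(magnitude_error(snr), 3))  # 0.027
```

At 1.5 mag above the 10σ limit the photon-noise error has thus 
dropped just below the assumed calibration error of about 0.03 
mag, so collecting more photons no longer improves the 
measurement noticeably. 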

An accurate relative calibration among many wavebands is 
best ensured by establishing a few spectrophotometric standard 
stars in each of the survey fields, a successful approach that we 
have made into a standard procedure in CADIS. This task can 
be carried out in a photometric night by taking spectra of the 
new standards and connecting these to published standard stars 
(Oke 1990). This way, spectrophotometric standards are 
available in every one of the survey exposures, which will not 
require any further calibration efforts regardless of the 
conditions under which the regular imaging is carried out. 
Obviously, standard star spectra are supposed to cover the 
entire filter set, but if a mixture of (e.g. optical and 
infrared) instruments is used, the calibration will involve 
different procedures to be matched. 



5.2. The optimum survey strategy 

The most basic result of our study on the performance of 
different multi-color surveys is that even for small systematic 
errors in the color indices of $\sigma_{sys} \approx 0.03$ mag, 
a survey with 17 bands performs better in classification and 
redshift estimation than one with only few bands. For the 17 
band case we found that the limiting magnitude for reasonable 
performance is reached when the typical statistical (i.e. 
photon noise) errors are on the order of 10%. It is obvious 
that larger systematic errors will worsen the performance and 
will allow even higher statistical errors before the survey 
deteriorates significantly. For the survey strategy this 
implies that pushing the statistical errors in each band well 
below the systematic errors will add nothing to the survey 
performance. When $\Delta t_{int}$ is the integration time 
required to reach $\sigma \approx \frac{1}{2}\,\sigma_{sys}$, 
the optimum number of bands N for a given amount of total time 
$t_{tot}$ is roughly 

$$ N = t_{tot} / \Delta t_{int} . \qquad (31) $$ 



Although our present study has been confined to the wavelength 
region attainable by optical CCDs and did not address the total 
wavelength coverage of the survey explicitly, it is predictable 
that further bands extending the wavelength coverage (e.g. by 
adding NIR bands) will have a larger effect than splitting the 
optical bands. In particular, the maximum redshift for a 
reliable galaxy classification will be extended. 

As the color indices are the prime observables entering the 
classification and redshift estimation process, it is clear that 
any multi-color survey has to be processed such that these 
indices are measured in an optimum way. For ground-based 
observations it is of great importance to prevent variable 
observing conditions from introducing systematic offsets 
between bands when the observations are taken sequentially. 
First of all, this requires assessing the seeing point spread 
function on every dataset very carefully. Second, one has to 
correct for the effect of variable seeing, which might 
influence the flux measurement of star-like and extended 
objects in a different way. 

In CADIS, we essentially convolve each image to a common 
effective point spread function and measure the central 
surface brightness of each object (see paper II for details). 
This has the disadvantage that the spatial resolution (i.e. the 
minimum separation of objects neighboring each other) is 
limited by the data with the worst seeing. However, it is not 
clear whether the obvious alternative, deconvolution 
techniques, can be optimized such that the systematic errors 
can be kept below a few percent for a wide variety of objects. 
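
The idea of bringing all images to a common point spread 
function can be illustrated under the simplifying assumption of 
Gaussian seeing profiles, for which widths add in quadrature 
under convolution. The sketch below is not the CADIS procedure 
(described in paper II). The seeing values, band names and the 
function name are hypothetical. 

```python
import math


def matching_kernel_fwhm(image_fwhm, target_fwhm):
    """FWHM of the Gaussian kernel that broadens a Gaussian PSF of width
    image_fwhm to target_fwhm (widths add in quadrature)."""
    if target_fwhm < image_fwhm:
        raise ValueError("convolution cannot sharpen an image")
    return math.sqrt(target_fwhm ** 2 - image_fwhm ** 2)


# Hypothetical seeing (FWHM in arcsec) of three exposures
seeing = {"B": 1.1, "R": 0.9, "I": 1.4}

# The common PSF is set by the exposure with the worst seeing
target = max(seeing.values())
kernels = {band: matching_kernel_fwhm(fwhm, target)
           for band, fwhm in seeing.items()}

for band, kernel in kernels.items():
    print(band, round(kernel, 3))
```

The exposure with the worst seeing needs no smoothing, while 
all sharper images are degraded to match it, which is the loss 
of spatial resolution mentioned above. 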

The performance of the MEV estimator depends critically 
on the assumption that not only the color indices but also their 
errors are determined correctly. For the survey strategy this 
implies that minimizing the photon noise errors at the expense 
of no longer being able to estimate these errors accurately may 
lead to worse performance than slightly larger errors which are 
known accurately. 

5.3. Ongoing applications and their scientific goals 

In this section, we want to mention examples of survey 
applications using this method and comment on the usefulness of 
our classification approach. A number of multi-color surveys 
have been conducted, where filters and exposure times were 
chosen to match some primary survey strategy. Although none 
of these might have been optimal choices in terms of a general 
classification, we used or intend to use our approach to extract 
class and redshift data on the objects contained. These surveys 
are, in chronological order of their beginning: 



1. The Calar Alto Deep Imaging Survey (CADIS): Three 
broad-band and twelve medium-band filters have mostly 
been chosen to match the needs of the emission line survey 
in CADIS, while some of them fill in gaps in the spectral 
coverage. The multi-color part of CADIS has been used 
to study the evolution of the galaxy luminosity function at 
z ≲ 1, to search for quasars at all visible redshifts and to 
use the observed faint stellar population to check models 
of the Galactic structure and the stellar luminosity function 
(Meisenheimer et al. 1998; Wolf et al. 1999; Fried et al. 
2000; Phleps et al. 2000; Wolf et al. 2000a). 

2. A lensing study of the galaxy cluster Abell 1689: Two 
broad-band and seven medium-band filters have been chosen 
to separate well between the cluster galaxies at z ≈ 0.19 
and the background population. The galaxy luminosity 
function in the background of the cluster is compared to 
a control field taken from CADIS, and the cluster mass is 
estimated from weak lensing effects on the apparent 
luminosities (Dye et al. 2000). 

3. A widefield project for Measuring Agn redshifts by 
Medium-Ban d O bservations (MAMBO): The filters (setup 
A from Sect. 4T ) are chosen to provide a selection function 
and a redshift accuracy for quasars and galaxies, which is 
as independent of redshift as feasible. The data will be used 
to study the faint end of the quasar luminosity function at 
all accessible redshifts z ≳ 1 and galaxy-quasar correlation
at z ≲ 1, as well as weak lensing effects in the cluster
group Abell 901/2 and in the open galaxy field (Wolf et al.
2000b).

4. The Sloan Digital Sky Survey (SDSS): Five broad filters 
have been chosen, which span the entire range of presently 
available CCD sensitivity. We intend to apply our classifi- 
cation to search for quasars and to separate stars from com- 
pact galaxies, where morphology data are not sufficient. 

From simulations of the classification scheme presented in
this paper, we expect that in all these projects we should be
able to classify virtually all objects above some magnitude
limit purely by color, and that the medium-band surveys in
particular should have selection functions which depend only
weakly on redshift. This way, we can omit morphological
criteria when defining catalogs of the stellar, galaxy and
quasar populations. This conclusion leads to a number of
advantages for our method, which we would like to state
explicitly here:

- The star-galaxy separation reaches deeper and avoids
confusion better than if based on morphology. Accordingly,
studies of the stellar population and of the galaxy
population can be extended to much fainter magnitudes.

- Quasars can be found at all accessible redshifts, especially
in the medium-band surveys, where the overlap with the
stellar sequence at redshifts of 2.2 ≲ z ≲ 3.5 is much
reduced in the color space.

- Quasars do not need to be selected from a subcatalog 
of point sources only, allowing for objects with resolved 
host galaxies to be found, including high-redshift Seyfert I 
galaxies. 

- We also believe that the multi-color redshifts from the
medium-band surveys are accurate enough that the
applications listed above do not require follow-up
spectroscopy. For example, the evolution of the galaxy
luminosity function can be analysed with the multi-color
redshifts, as long as no trends within a redshift interval
Δz < 0.1 are searched for.

6. Summary 

We presented an innovative method that performs a multi-color 
classification and redshift estimation of astronomical objects in 
a unifying approach. The method is essentially based on tem- 
plates and evaluates the statistical consistency of a given mea- 
surement with a database of spectral knowledge, serving as a 
second, very crucial input to the algorithm. 

The introduction of this method is motivated by the quest
for a statistically correct extraction of the information present
in the color vectors of surveys with many filters. The method
is derived from basic statistical principles and calculates
probability density functions for each survey object, which
yield two results simultaneously: the class membership, and
redshift estimates according to the Maximum Likelihood (ML)
and Minimum Error Variance (MEV) estimators. We add our
own version of the MEV technique, which features improved
handling of bimodalities in the probability function.
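
As an illustration of the two outputs described above, the
following minimal Python sketch evaluates Gaussian likelihoods
of a measured color vector against a small template library and
derives class probabilities as well as ML and MEV redshift
estimates. The toy library, the equal class priors and all
function names (template_weights, classify, redshift_estimates)
are assumptions made for this example and are not taken from the
actual implementation. In particular, the sketch uses the plain
MEV estimator and does not include the improved handling of
bimodal probability functions.

```python
import math

def template_weights(colors, sigmas, library):
    """Gaussian likelihood of the measured colors for every template.

    library: list of (class_name, redshift_or_None, template_colors).
    Returns a list of (class_name, redshift, weight).
    """
    weights = []
    for cls, z, tmpl in library:
        chi2 = sum(((q - t) / s) ** 2 for q, t, s in zip(colors, tmpl, sigmas))
        weights.append((cls, z, math.exp(-0.5 * chi2)))
    return weights

def classify(weights, library_sizes):
    """Class probabilities from the mean template likelihood per class.

    Equal prior probabilities for all classes are assumed here.
    """
    per_class = {}
    for cls, _, w in weights:
        per_class[cls] = per_class.get(cls, 0.0) + w / library_sizes[cls]
    total = sum(per_class.values())
    return {cls: p / total for cls, p in per_class.items()}

def redshift_estimates(weights, cls):
    """Return (z_ML, z_MEV, sigma_z) for the templates of one class."""
    members = [(z, w) for c, z, w in weights if c == cls]
    norm = sum(w for _, w in members)
    z_ml = max(members, key=lambda m: m[1])[0]         # best single template
    z_mev = sum(z * w for z, w in members) / norm      # weighted mean redshift
    var = sum((z - z_mev) ** 2 * w for z, w in members) / norm
    return z_ml, z_mev, math.sqrt(var)

# Toy library with two color indices per template (hypothetical values).
library = [
    ("star", None, (0.2, 0.1)),
    ("star", None, (0.6, 0.3)),
    ("galaxy", 0.1, (1.0, 0.5)),
    ("galaxy", 0.2, (1.2, 0.7)),
    ("galaxy", 0.3, (1.4, 0.9)),
]
sizes = {"star": 2, "galaxy": 3}

observed, errors = (1.22, 0.68), (0.05, 0.05)
w = template_weights(observed, errors, library)
probs = classify(w, sizes)
z_ml, z_mev, z_err = redshift_estimates(w, "galaxy")
print(probs, z_ml, z_mev, z_err)
```

The redshift error is obtained here from the spread of the same
weights that give the MEV estimate. For a library with tens of
thousands of templates, the likelihoods would better be handled
in log space to avoid numerical underflow.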

Our choice for the database is a large, systematically
ordered library containing templates for stars, galaxies and
quasars, which are intended to cover all but a few unusual
members of each of the three object classes. The libraries
were established from a few model assumptions and from
templates published in the literature by various authors.

The method can be implemented in a computationally very
efficient way by using color indices directly as object features.
We showed that our color-based approach is expected to deliver
results consistent with those from flux-based template-fitting
algorithms.
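
To make the notion of color indices as object features concrete,
the following short Python sketch converts fluxes measured in a
sequence of filters into magnitude differences with first-order
error estimates. The choice of neighbouring filter pairs, the
function name color_indices and the example numbers are
assumptions for illustration only. The error propagation is the
usual linear approximation, which requires positive fluxes with
a reasonable signal-to-noise ratio, and it ignores the
correlation between indices that share a filter.

```python
import math

def color_indices(fluxes, flux_errors):
    """Color indices m_i - m_(i+1) of neighbouring filters and their errors."""
    k = 2.5 / math.log(10)  # converts a relative flux error into magnitudes
    colors, errors = [], []
    for i in range(len(fluxes) - 1):
        f1, f2 = fluxes[i], fluxes[i + 1]
        e1, e2 = flux_errors[i], flux_errors[i + 1]
        colors.append(-2.5 * math.log10(f1 / f2))
        # Relative flux errors of both filters added in quadrature.
        errors.append(k * math.hypot(e1 / f1, e2 / f2))
    return colors, errors

# Three filters with arbitrary flux units (hypothetical values).
colors, errors = color_indices((100.0, 100.0, 10.0), (1.0, 1.0, 1.0))
print(colors, errors)
```

An object with N measured fluxes yields N - 1 such indices,
which can then be compared directly with the precomputed colors
of the library templates.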

The accuracy of the data calibration is a very important is- 
sue, constraining the design of the libraries and limiting the 
maximum achievable performance of the method via the effec- 
tive photometric quality. Calibration errors can distort results 
and shrink the information output. 
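
The limiting role of the calibration can be illustrated with a
small Python sketch. It assumes that the effective photometric
error of a color index is the quadratic sum of the photon-noise
error and a fixed calibration error. Both this simple error
model and the numbers used are assumptions for illustration.

```python
import math

def effective_error(sigma_phot, sigma_cal):
    """Effective error: photon noise and calibration error in quadrature."""
    return math.hypot(sigma_phot, sigma_cal)

sigma_cal = 0.03  # assumed calibration error in magnitudes
for sigma_phot in (0.10, 0.03, 0.01, 0.001):
    sigma_eff = effective_error(sigma_phot, sigma_cal)
    print(f"photon noise {sigma_phot:.3f} mag -> effective {sigma_eff:.4f} mag")
```

Under this assumption, the effective error never drops below the
calibration error, so improving the photon statistics of bright
objects beyond that level adds little information.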

We carried out Monte-Carlo simulations for three model
surveys using the same total exposure time but different filter
sets. One of them is a UBVRI broad-band survey, while the
other two spend two thirds of the time in various medium-band
filters. Altogether, the performance of all three setups
was rather similar despite the quite different numbers of
collected photons. It thus appears that medium-band filters
obtain more information per photon and thereby compensate for
the loss of depth in flux detection from which they suffer in
comparison to broad bands. Among the differences, medium-band
surveys performed better than the broad-band survey at
finding quasars, and they provided much higher redshift
resolution in their estimates. Also, in the presence of
calibration errors or uncorrected reddening effects, bright
objects are not easier to classify than faint ones, and a large
number of shallow filters might provide more information than a
small number of deeply exposed filters.

Based on simple analytic assumptions, we have discussed
the relative information content of surveys with different
characteristic filter widths. All surveys using the same amount
of total telescope time and filter sets stretching over the
entire spectral range of interest should perform equally well
in terms of classification. This theoretical conclusion depends
on perfect calibration and perfect template knowledge.

In practice, the classification should reach deeper in
medium-band surveys than in broad-band surveys, because the
former are less affected by inaccuracies in the calibration and
in the template library. Furthermore, the filters can be chosen to
avoid noise from strong night sky emission lines, which is not
possible with broad-band filters.

In particular, using the proposed statistical classification ap- 
proach in a suitable medium-band survey it should be possible 

- to separate stars from apparently compact galaxies down to 
rather deep limits exceeding the potential of morphological 
classification, 

- to find quasars rather efficiently and completely, i.e. with 
very little contamination, and 

- to obtain quite good multi-color redshift estimates with
errors on the order of σ_z ≈ 0.1 for z > 2 quasars, and on
the order of σ_z ≈ 0.03 for galaxies.

This method should be very suitable for many survey-type
applications, which usually require only low spectral resolution
and finite accuracy in the derivation of physical parameters, but
aim for large samples to feed statistical studies and to search for
rare and unusual objects. Of course, if a fully secure
confirmation of the nature of an individual object is needed,
or if high-resolution studies are the aim, the method provides
only a preselection of candidates.

In Paper II we show that this method is very powerful and
indeed of great practical relevance for multi-color surveys with
many filters, as in the case of CADIS. The results of the
simulations shown here compare well with the performance of a
real survey, and they can therefore be used for testing the
performance of future survey designs.